Optimal Portfolio Modeling


Contents

Preface
Acknowledgments

CHAPTER 1  Modeling Market Microstructure—Randomness in Markets
  The Random Walk Model
  What You Cannot Predict Is Random to You
  Market Microstructure
  Efficient Market Hypothesis
  Arbitrage Pricing Theory

CHAPTER 2  Distribution of Price Changes
  The Normal Distribution
  Reflection Principle
  Approximation of the Normal Distribution by Rational Polynomial
  Lognormal Distribution
  Symmetry of the Normal and Lognormal
  Why Pick a Distribution at All?
  The Empirical Distribution
  The Lognormal as an Approximation

CHAPTER 3  Investment Objectives
  Statistician's Fair Game
  A Fair Game Is a Loser!
  Criteria for a Favorable Game
  Gambler's Ruin
  Optimal Return Models
  Markets Are Rational, Psychologists Are Not
  The St. Petersburg Paradox
  Compounded Return Is the Real Objective
  Defining Risk
  Minimum Risk Models
  Correlation of Assets
  Summary of Correlation Relationships
  Beta and Alpha
  The Efficient Frontier and the Market Portfolio
  The Sharpe Ratio
  Limitations of Modern Portfolio Theory

CHAPTER 4  Modeling Risk Management and Stop-loss Myths
  Stop-loss Orders
  Stops: Effect on the Mean Return
  Stops: Effect on the Probability of Gain
  Stops: Probability of Being Stopped Out
  Stops: Effect on Variance and Standard Deviation
  Effect on Skew
  Effect on the Kurtosis
  Stop-loss: Summary
  Modeling Stops
  Identifying When to Use Stops and When Not To
  Stop-Profits
  Puts and Calls

CHAPTER 5  Maximal Compounded Return Model
  Optimal Compound Return Models
  Relative Returns
  Average Stock Returns, but Compound Portfolio Returns
  Logarithms and the Optimal Exponential Growth Model
  Position Sizing as the Only Guaranteed Risk Control
  Controlling Risk through Optimal Position
  Maximize Compounded Portfolio Return
  Maximal Compounded Return Models
  What the Model Is and Is Not
  Modeling the Empirical Distribution
  Correlations
  The Enhanced Maximum Investment Formulas
  Expected Drawdowns May Be Large

CHAPTER 6  Utility Models—Preferences Toward Risk and Return
  Basis for a Utility Model
  History of Logarithms
  Optimal Compounded Utility Model
  The Sharpe Ratio
  Optimal Model for the Sharpe Ratio
  Optimization with Excel Solver

CHAPTER 7  Money Management Formulas Using the Joint Multiasset Distribution
  The Continuous Theoretical Distributions
  Maximal Log Log Model in the Presence of Correlation
  Optimal Sharpe Model with Correlation
  The Empirical Distribution
  Maximal Log Log Model in the Presence of Correlation
  Maximizing the Sharpe Ratio in the Presence of Correlation

CHAPTER 8  Proper Backtesting for Portfolio Models
  Assuring Good Data
  Synchronize Data
  Use Net Changes Not Levels
  Only Use Information from the Past
  Predictive Studies versus Nonpredictive Studies
  Use Intraday Highs and Lows for Model Accuracy
  Adjusted Data May Be Erroneous
  Adjusting Your Own Data
  Miscellaneous Data Pitfalls
  Tabulate and Save the Detailed Results with Dates
  Overlapping Dates Are Important for Correlations
  Calculate Mean, Standard Deviation, Variance, and Probability of Win
  Robust Methods to Find Statistics
  Confidence Limits for Robust Statistics

CHAPTER 9  The Combined Optimal Portfolio Model
  Choosing the Theoretical Distribution
  The Empirical Distribution
  Selecting Sharpe versus a Log Log Objective Function
  Model Simulation
  Professional Money Manager versus Private Investor

ABOUT THE CD-ROM
  Introduction
  System Requirements
  What's on the CD
  Updates to the CD-ROM
  Customer Care

APPENDIX 1  Table of Values of the Normal Distribution
APPENDIX 2  Installing R
APPENDIX 3  Introduction to R
APPENDIX 4  R Language Definition

Index


Preface

Successful investing is more than just picking the right stock. It is not just about
timing the market. It requires two essential ingredients. The first ingredient is having a
winning edge. The author's motto is: In order to beat the market, you need an edge.
Never let your money leave home without it.

However, that is not enough. Even if one has a winning edge, it is entirely possible to
trade too recklessly or too conservatively and not achieve your objectives. Even with a
winning system, excessive position sizes can destroy a trade just as surely as lack of an
edge. Failure to take adequate risks can also lead to underperformance.

The purpose of this book is to show how to achieve the right balance of position
sizing and risk management so as to achieve the investment goals. Many books have
shown how to have a winning edge. That is not the purview of this work. Rather, this
book focuses on the relatively unexplored realm of money management and portfolio
modeling. Managing a portfolio through position sizing is at least as important as finding
and maintaining an investment edge.

Optimal Portfolio Modeling provides an introduction to the statistical properties
of markets in the early chapters. The book is oriented to intelligent people who may not
be full-time rocket scientists. The author resisted the very strong temptation to title this
book Rocket Science for Average Folks. Instead, the seemingly more dignified title
was chosen.

Nevertheless, this work is designed to be very accessible to all, even with a limited
math background. Only high school algebra is required to understand this book. Readers
with an advanced technical degree may be astonished to find that calculus is not required.
Some may argue that optimization requires the use of calculus. This is true. However,
Excel knows how to do the calculus and so does the statistical language R. Users do not
need to know the math in order to understand the result. Only basic high school algebra
is required. The tools know how to do the magic. The user merely needs to know how to invoke
the spell and how to interpret the results. This book assumes the reader has some
beginner's level of knowledge of Excel. The text fully explains how to use the built-in
Solver, which allows the user to optimize models.




For readers interested in the statistical language R, advanced users will find this 
work to be an augmentation of their knowledge, with special emphasis on portfolio mod- 
eling and optimization problems. Beginning users of the open-source language R will 
appreciate the fact that the appendices and CD offer both a tutorial and introduction to 
R targeted to the beginning user. Additionally, the text identifies those functions that are 
appropriate for optimization in the powerful R library. 

An important part of setting up a portfolio model is defining the objective or goal 
that one wishes to achieve. A significant portion of this book is devoted to the question 
of what is the right goal for the investor to seek. Often, this book refers to that goal as the
utility function. Not all investors will or should choose the same objective function. The text
discusses how to choose an appropriate objective function. A manager who is bench- 
marked to the Sharpe Ratio should choose that as the objective function. However, for 
more typical investors, the book strongly argues that maximizing long-term compounded 
growth of wealth should be an important component of the overall model. An innovative 
formula is provided that optimizes long-term wealth as well as provides a measure of 
stability not seen in other optimal money management formulas published to date. 


Acknowledgments

Any work such as this builds on the advances and ideas of so many previous giants in
the field. In particular, this work is based on the breakthroughs and developments
of countless researchers in the fields of finance and statistics. Although they are
too numerous to name in this space, the author wishes to acknowledge the foundation
that they have built in the field. This book would not have been possible without the
giants who preceded us, many of whom are named in the text.

Every person owes a debt of gratitude to those who have taught them throughout
their lives. In this regard, I wish to acknowledge and thank my parents, who were my
first teachers. They encouraged and inspired my endeavors.

The efforts and dedication of the many teachers and even fellow students who have
helped in my education must be gratefully acknowledged. Although too numerous to
name individually, the cumulative contribution of their efforts and generosity with their
knowledge has been of inestimable value in my personal development. Naturally, it has
been a requisite foundation, essential to this work.

There is one particular teacher I met as a professor of finance when we were both
associated with the University of California at Berkeley. He is Victor Niederhoffer, who
has enjoyed a meteoric speculative career. Prior to meeting him, I had the fuzzy notion
that the best approach to the markets was through a rigorous quantitative methodology.
At that time, the random walk ruled the thinking in finance. So there was little call for
eccentric academics who thought they could beat the market, quantitatively or any other
way. Professor Niederhoffer was just such a divergent thinker.

His help and guidance taught me to see things at their simplest. That is the essence
of his approach. His enlightenment also helped me to learn how to avoid the numerous
pitfalls that can arise in quantitative studies. In fact, one of the things he taught me was
what not to do on a quantitative study. Perhaps more than anything, working with him
inspired and motivated me to further my efforts to study the markets from a quantitative
perspective. His inspiration has lasted a lifetime.

One must also acknowledge the hundreds of friends known as the Spec List who
have stimulated my thinking, inspired my work, and helped me in so many ways. Victor
was the founder of the group and remains its chair. We all communicate daily to discuss 
markets with a quantitative focus. Unquestionably, the help and support of this remark- 
ably intelligent but diverse group has been a source of inspiration to me. Many of the 
ideas herein are amplifications of some of the ideas I have discussed more briefly with 
the group. Their collective comments, questions, and debate have helped to refine these 
ideas. Several of my friends in the group have encouraged me to write this book, and 
certainly this must be acknowledged. 

Perhaps most directly, I must acknowledge the very helpful people at Wiley who 
were directly responsible for helping with this work. Foremost of those is Pamela van
Giessen, who edited this work. Her experienced guidance and advice were critical in
shaping this book. Her encouragement to become an author was the final impetus that 
helped to launch this project. 

Finally, I must thank Kate Wood and Jennifer MacDonald of Wiley, who were so 
helpful in supporting and editing this book. There are also many other people at Wiley 
whose names I do not even know who have contributed to this effort. I am very grateful 
to all of them. 

Even though many have contributed to this work, ultimately, any errors are mine. 


CHAPTER 1
Modeling Market Microstructure—Randomness in Markets

Portfolio modeling has been the domain of highly quantitative people
with advanced degrees in math and science. On Wall Street, such people are commonly
called rocket scientists. Optimal Portfolio Modeling was written to provide
an easily accessible introduction to portfolio modeling for readers who prefer an
intuitive approach. This book can be read by the average intelligent person who has only a
modest high school math background. It is designed for people who wish to understand
rocket science with a minimum of math.

The focus of this book is on money management. It is not a book about market
timing, nor is it designed to help you pick stocks. There are numerous other books that
address those subjects. Rather, this work will show the reader how to define models to
help manage money and control risk. Stock selection is really just the details. The big
picture is actually about achieving your overall portfolio goals.

Included with this book is a CD-ROM that includes numerous examples in both
Excel and R, the statistical modeling language. The book assumes the user has a beginner's
level knowledge of Excel and focuses mainly on those specific areas that apply to
portfolio modeling and optimization. There are many books that offer an introduction to Excel,
and the interested reader is encouraged to investigate those.

R is an open-source language that offers powerful graphics and statistics capabilities.
Two appendices in this book offer introductory support for users who wish to download
R at no cost and learn how to program. Because R is powerful, many functions and graphs
can be done with very few command lines. Often, only a single line will create a graph or
perform a statistical analysis.


The overriding philosophy of all of the examples is simplicity and ease of under- 
standing. Consequently, each example typically focuses on a single simple problem or 
calculation. It is the job of the computer to know how to perform the calculations. The 
user only needs to know how to invoke the right computer function and to understand 
the results. Understanding and intuition are the primary goals of this book. 

This chapter introduces the important background of market microstructure and 
randomness. This is a foundation for the ideas developed later in this book. The discus- 
sion starts with a thorough introduction to the idea of randomness and what a random 
walk is. The topic of randomness is presented as an essential element in understanding 
how and why a portfolio works. After all, the primary rationale for a portfolio is intelli- 
gent diversification. 

From there, the book moves to a discussion of market microstructure and how it 
affects the operation of markets. Later, the reader is introduced to the efficient market 
hypothesis, along with its history and development, starting with early pioneers in the 
field. Augmenting this is the discussion on arbitrage pricing theory and its modern
applications. This latter topic shows how the market identifies and eliminates any riskless
arbitrage opportunities.

Trading speculative markets has always been difficult. Over the years, several stud- 
ies have shown that some 70 to 80 percent of all mutual funds underperform the aver- 
ages. A study by Professor Terrance Odean of the University of California at Berkeley 
demonstrated that most individual investors actually lose money. This study analyzed 
thousands of real-life individual investor brokerage accounts. Thus, it provides a compre- 
hensive look at how real individual traders operate. The inescapable conclusion is that 
both professional and individual investors find that trading the markets is challenging. 

Successful trading is predicated on one thing. Traders must predict the direction 
of price changes in the future. At a minimum, a successful trader must predict prices so 
that each trade has an expectation of yielding a profit. This does not mean that each trade 
must be successful, but, rather, that a succession of trades would usually be expected to 
result in a profit. This should not be taken to mean that having a positive expectation for 
each trade is the only thing a successful trader needs. The astute reader will note that 
the use of words such as usually, average, and expectation naturally implies that the art 
of forecasting is far from perfect. In fact, it is best studied from a statistical perspective 
with a view to identifying what is random and what is predictable. 

In a recent 500-day period, the stock market as measured by the Standard and Poor's
500 index was generally a modestly up market. A statistical analysis of the daily
compounded returns for the period shows:


Average daily return: .038 percent 
Standard deviation: .640 percent 
Probability of rise: 56 percent 




The standard deviation is simply a measure of the variability of returns around the 
average. From this simple analysis, we can make some interesting observations: 


1. The average daily return is small with respect to the standard deviation.
2. The daily variability is relatively large, at roughly 17 times the average return (0.640 / 0.038 is about 16.8).
3. The market went up 56 percent of the time, or slightly more than half. It also went
   down the other 44 percent of the days. So even during up markets, the number of up
   days is only slightly better than 50-50.
4. The variability completely swamps the average return.
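
The arithmetic behind these observations can be checked in a few lines. The sketch below is in Python, chosen here only for illustration (the book's own examples use Excel and R). The function names are hypothetical, and the horizon calculation assumes independent daily returns and ignores compounding, which is a simplification rather than anything the sample itself establishes.

```python
import math

# Figures quoted in the text for the 500-day S&P 500 sample (percent per day).
mean_daily = 0.038
sd_daily = 0.640

def noise_to_signal(mean, sd):
    """Ratio of the standard deviation to the mean for a single period."""
    return sd / mean

def horizon_signal_to_noise(mean, sd, n_days):
    """Expected cumulative return divided by its standard deviation over n_days.

    Assumption (illustration only): daily returns are independent and
    compounding is ignored, so the mean grows with n and the standard
    deviation grows with sqrt(n).
    """
    return (n_days * mean) / (math.sqrt(n_days) * sd)

ratio = noise_to_signal(mean_daily, sd_daily)
print(f"daily sd / mean: {ratio:.1f}")
for n in (1, 20, 250, 500):
    print(n, round(horizon_signal_to_noise(mean_daily, sd_daily, n), 2))
```

On the quoted figures the one-day ratio is about 16.8. Under the stated assumption, the average return only begins to stand out from the variability over horizons of several hundred days.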


Observations such as these have led many early researchers in finance to propose 
a model for the markets that explicitly embraces randomness at its very core. A corner- 
stone of this idea is that markets represent all of the knowledge, information, and in- 
telligent analysis that the many participants bring to bear. Thus, the market has already 
priced itself to correspond with the sum of all human knowledge. In order to outperform 
the market, a trader must have better information or analysis than the rest of the partic- 
ipants collectively. It would seem the successful trader must be smarter than everyone 
else in the world put together. 


The Random Walk Model

To the typical layman, the random walk model is the best-known name for the idea that
markets are very good at pricing themselves so as to remove excess profit opportunities. 
The academic community generally prefers the description the efficient market hypothe- 
sis (EMH). Either way, the idea is the same—it is very difficult to outperform the market. 
If someone does outperform, then it is likely only attributable to mere luck and not skill. 

The history of the EMH is a rather long one. The first known work was by Louis 
Bachelier in 1900, in which he posited a normal distribution of price changes and de- 
veloped the first-known option model based on the idea of a normal random walk (see 
Figure 1.1). His seminal paper in the field was quickly forgotten for some 60 years. As an 
interesting side note, the mathematics that Bachelier developed was essentially the same 
analysis that Albert Einstein reinvented in 1905 in his study of Brownian motion of
microscopic particles. Einstein's famous paper was published some five years after Bachelier's
work. However, Bachelier's paper languished in relative obscurity until its rediscovery in
the 1960s. 

[Figure 1.1 Normal Probability Distribution: plot of function(x) dnorm(x) for x from -3 to 3]

Prof. Paul Samuelson of the Massachusetts Institute of Technology offered a "Proof
That Properly Anticipated Prices Fluctuate Randomly" in the 1960s. This provided a
theoretical basis for the EMH idea. However, it fell to M. F. M. Osborne to provide the
modern theoretical basis for the efficient market hypothesis. Osborne was the first to
posit the idea of a lognormal distribution and provide evidence that the price changes 
in the market were log normally distributed. Furthermore, he was the first modern re- 
searcher to draw the link between the fluctuations of the market and the mathematics of 
random walks developed by Bachelier and Einstein decades earlier. 

Osborne was a physicist by training employed at the U.S. Naval Observatory. As 
such, he was not an academic, nor did he come from a traditional finance background. 
Thus, it is not surprising that he is rarely recognized as the father of the efficient market 
hypothesis in the lognormal form. However, it is very clear that his empirical and
theoretical work, which described the distribution of stock price changes as lognormal and
the underlying process of the market as akin to the Brownian motion described by Einstein,
was the first to elucidate both concepts. Osborne deserves the
honor of being the father of the EMH. 

As so often happens in academia, others who published later and were fully aware of 
Osborne's work have received much of the credit. Statistician and student of mathemati- 
cal and statistical history, Stephen M. Stigler has whimsically called the phenomenon his 
law of eponymy. The wrong person is invariably credited with any given discovery. 

One aspect of this phenomenon is that when a person is erroneously credited with
a discovery for whatever reason, his or her name is attached to that discovery. After
much widespread usage, the name tends to stick. So even when it is later discovered by
historians that someone else actually discovered the idea first, it is usually just treated as
a footnote and rarely adopted into common usage among practitioners in the field. 

Such is the case for Osborne’s contribution to the efficient market hypothesis. It was 
partly because he was a physicist working in the field of astronomy. At the time of his 
publication, he was not really an accepted name in the field of finance. 

One form of the EMH defines the relationship between today's price $X_i$ and
tomorrow's price $X_{i+1}$ as follows:

$$X_{i+1} = X_i + e \qquad (1.1)$$


where e is a random error term. We note that this model is inherently an additive model. 
The usual academic assumption corresponding to this type of model is the normal distri- 
bution. The key concept is that the normal distribution is strongly associated with sums 
of random variables. In fact, there is a weak convergence theorem in probability theory
(the central limit theorem) that states that for sums of independent, identically distributed
variables with finite variance, the distribution of the suitably standardized sum converges
to the normal distribution as the number of terms grows. This result virtually
assures us that the normal distribution will remain ubiquitous in nature.
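
A small simulation makes the link between the additive model and the normal distribution concrete. This is a minimal sketch in Python (used here only for illustration; the book's own examples use Excel and R). The uniform shock distribution, the seed, and the function names are assumptions made for the example, not choices taken from the text.

```python
import random
import statistics

def additive_walk(start, n_steps, rng):
    """Price path under X[i+1] = X[i] + e, with e uniform on [-1, 1] (assumed)."""
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + rng.uniform(-1.0, 1.0))
    return path

def standardized_sums(n_terms, n_samples, rng):
    """Sums of n_terms uniform(-1, 1) shocks, each divided by the theoretical
    standard deviation of the sum, sqrt(n_terms / 3)."""
    scale = (n_terms / 3.0) ** 0.5
    return [
        sum(rng.uniform(-1.0, 1.0) for _ in range(n_terms)) / scale
        for _ in range(n_samples)
    ]

rng = random.Random(42)  # fixed seed so the run is repeatable
path = additive_walk(100.0, 250, rng)
sums = standardized_sums(50, 5000, rng)

# For a standard normal, about 68 percent of values lie within one
# standard deviation of zero; the standardized sums should be close to that.
within_one_sd = sum(abs(s) <= 1.0 for s in sums) / len(sums)
print(round(statistics.mean(sums), 3),
      round(statistics.stdev(sums), 3),
      round(within_one_sd, 3))
```

Although each individual shock is uniform rather than normal, the standardized sums have mean near 0, standard deviation near 1, and roughly the normal share of values within one standard deviation.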

However, the empirical work of Osborne showed us that the distribution of price 
changes was log normal. This type of distribution is consistent with a multiplicative 
model of price changes. In this model, the expression for price changes becomes 


$$X_{i+1} = X_i(1 + e) \qquad (1.2)$$


Some would argue that the market is not random. Certainly, almost every single par- 
ticipant in the market believes he or she will achieve superior results. Most of these 
participants are smarter, richer, and better educated than average. Can they all achieve 
superior returns? Of course, it would be mathematically impossible for everyone to be 
above average. Can they all be deluded? 

To answer this question it is helpful to look at the long-term history of the market. 
When we fit a regression line through the monthly Standard & Poor's closing prices $P_t$ on
the first trading day of each month since 1950 until November 2006, we find the following:

$$\ln P_t = 0.0059\,t + 3.06$$

In this case the $t$ values are simply month numbers starting at 1, then 2, and so on
for each of the 683 months in the study. The fitted coefficient 0.0059 can be interpreted
as a simple monthly rate of increase in the series. So if we annualize, we get an annual
rate of return of about 7.1 percent for the long-term growth rate of the Standard & Poor's
500 average (see Figure 1.2). This is a very respectable long-term upward trend in the
market. The $R^2$ for this regression was 97 percent. Given that 100 percent is a perfect fit,
this indicates that the model is a very good one.

[Figure 1.2 Log S&P Regression as Function of Time: ln(SnP) and predicted ln(SnP) plotted against time]
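
The annualization step, and the kind of straight-line fit described above, can be sketched as follows. The code is Python, used here only for illustration (the book's own examples use Excel and R). The helper `fit_line` is a hypothetical name, and the series it is run on is synthetic, generated from the quoted line itself, because the actual S&P data are not reproduced in the text.

```python
import math

slope = 0.0059    # fitted monthly coefficient quoted in the text
intercept = 3.06  # fitted intercept quoted in the text

# Simple annualization, which matches the "about 7.1 percent" in the text.
simple_annual = 12 * slope
# Alternative reading: treat the log slope as a continuously compounded rate.
compound_annual = math.exp(12 * slope) - 1

def fit_line(ts, ys):
    """Ordinary least squares for y = a*t + b; returns (a, b, r_squared)."""
    n = len(ts)
    t_bar = sum(ts) / n
    y_bar = sum(ys) / n
    sxx = sum((t - t_bar) ** 2 for t in ts)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(ts, ys))
    a = sxy / sxx
    b = y_bar - a * t_bar
    ss_res = sum((y - (a * t + b)) ** 2 for t, y in zip(ts, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return a, b, 1 - ss_res / ss_tot

# Synthetic check only: a noiseless series built from the quoted line,
# NOT actual S&P closing prices.
months = list(range(1, 684))  # 683 monthly observations
log_prices = [slope * t + intercept for t in months]
a, b, r2 = fit_line(months, log_prices)

print(f"simple: {simple_annual:.4f}  compounded: {compound_annual:.4f}")
print(f"recovered slope {a:.4f}, intercept {b:.2f}")
```

On real data the fit would not be exact. The synthetic series is there only to show that the fitting routine recovers a known slope and intercept.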

The underlying message here is that the market goes up over time. The fact that 
the natural log model fits well tells us that the growth in the market is compounded 
and presumably derived from a multiplicative model. But beyond that, it tends to make 
people think they are financial geniuses who might not be. 


Bull markets make us all geniuses. 
—Wall Street maxim


From the perspective of the long-term time frame, the market has been in a bull 
phase for at least the entire last century. Human beings have a natural propensity to 
attribute good luck to their own innate skill. Psychologists call this the self-attribution
fallacy. The long-term bull market has created a large group of investors who believe
they have some superior gift for investing. Few investors ever stop to critically analyze 
their own results to verify that they are indeed performing better than the market. 

Given that the market exhibits long-term compounded returns over time, it is clear 
the best model is a multiplicative one. This long-term return is often called the drift, the
tendency of the market to move inexorably upward over time. However, to understand
the shorter-term movements of the market, we must look to a different kind of model
in which the short-term fluctuations appear to be more random. The reason for that is 
simply because the marketplace in general will anticipate all known information, and 
thus, the current market price is the best price available. Thus, by definition, any news 
that is material to the market and was not anticipated will appear as random shocks in 
either direction. 




The key idea to understand is that the market will not respond to news that it already
knows. Or if it does, that response will be contrary to what a rational analyst might have
expected. These contrary movements are caused when a large group of investors was
expecting a certain piece of news and was thus holding positions that were previously taken.
When the news is announced, the entire group may try to unwind their positions,
resulting in a market movement in exactly the opposite direction one might expect. Simply
put, the market has already discounted the expected news and adjusted the price well in
advance. Because this phenomenon is so prevalent, Wall Street has evolved the maxim,
"Buy on the rumor, sell on the news." Although one would never recommend relying on
rumors for investment success, certainly buying on the correct anticipation of news is
a better strategy.

This leaves us with the realization that, absent informed knowledge of upcoming
news, the outcome of such events will be random and unpredictable to us. Some would
argue that for most news someone knew the event in advance. Certainly for earnings
announcements and government reports, someone did know the information to a certainty.
For them, the news was not random but completely predictable. Assuming the
information was not widely disclosed, then for the rest of investors, the information remains
random and unpredictable.

There is a general principle at work here. If we cannot predict the news, then it is
random to us. So even if others know the information, then insofar as we do not, and
cannot predict it, it remains random for us.


Market Microstructure

Generally speaking, the market consists of the interactions between four broad classes
of orders, which arise from two distinctions. There are market orders and there are
limit orders. There are orders to buy and orders to sell. Although there are variations
and nuances on each, these characterize the main categories of trading orders.

• Market order: A market order is an order to buy or sell that is to be executed
  immediately at the best available price.
• Limit order: This is an order to buy or sell that is only to be executed at the
  specified limit price or better. Limit orders may have an expiration, such as the
  end of the day or 60 days.

The quote at any given time is essentially based on the best limit order to buy, which
is known as the bid, and the best limit order to sell, which is the ask. When a market
order to sell comes in, it is usually crossed with the bid. Therefore, we can expect the
price of a market order to sell to be the bid price. We should note that market orders are
usually smaller in size than limit orders. Usually, this means that the market order will 
be executed at the bid and that the remaining size of the limit order(s) at the bid will be 
reduced by the amount of the market order. In effect, market orders nibble away at the 
larger limit orders. It is only after enough market orders have consumed the bid that the 
bid-ask quote will drop to a new lower bid.

The other side of this process is when a market order to buy, say, 200 shares comes 
in. Assume there are 1,000 shares for sale at the ask price of 50.10. In this case, the 
market order will be crossed with the limit order, resulting in a transaction of 200 shares 
at a price of 50.10. After the transaction, the ask side limit order will show the remaining
800 shares offered at 50.10. It is only after the 800 shares have been consumed by market
orders that the ask price will move higher. 
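
The 200-share example can be traced with a toy model of the ask side of the book. This is a minimal sketch in Python (used here only for illustration; the book's own examples use Excel and R). The `AskSide` class and the second price level at 50.15 are hypothetical, added so that there is a "next limit order up" for the ask to move to.

```python
from collections import deque

class AskSide:
    """Resting limit orders to sell, lowest price first. Illustrative only."""

    def __init__(self, levels):
        # levels: iterable of (price, shares), already sorted by ascending price
        self.levels = deque([price, shares] for price, shares in levels)

    def best_ask(self):
        """Current ask price, or None if no sell orders are resting."""
        return self.levels[0][0] if self.levels else None

    def market_buy(self, shares):
        """Cross a market buy with resting asks; returns a list of (price, shares) fills."""
        fills = []
        while shares > 0 and self.levels:
            level = self.levels[0]
            take = min(shares, level[1])
            fills.append((level[0], take))
            level[1] -= take
            shares -= take
            if level[1] == 0:
                # The limit order at this price is fully consumed, so the ask moves up.
                self.levels.popleft()
        return fills

# 1,000 shares offered at 50.10 as in the text; the 50.15 level is hypothetical.
book = AskSide([(50.10, 1000), (50.15, 500)])
first = book.market_buy(200)
print(first, book.best_ask(), book.levels[0][1])
```

After the 200-share market order, the ask is still 50.10 with 800 shares remaining. A further `book.market_buy(800)` consumes that level, and only then does `best_ask()` move to the next price.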

Because limit orders tend to persist longer than market orders and are larger than
market orders, there is a tendency for the last sale price to alternate back and forth
between the bid and ask until one or the other price barrier is consumed. Only then does
the quote move. For example, when the ask price is extinguished, the ask will move to
a new higher price, the next limit order up. Quite often, the old bid will be superseded
by a slightly higher bid, either from a market maker or an off-floor limit order. Thus,
the entire quote has a tendency to move up. To understand the current market
situation, one must look beyond the last sale and consider the current bid
and ask and the relative size of the bid and ask.

Another important aspect of this market microstructure can be understood in the 
sense of news. We can view the arrival of market orders as news of an investor’s decision 
process. In some cases, orders to sell may simply indicate a need for liquidity. It may be 
as simple as Aunt Mabel in Peoria sold 100 shares to raise money to buy videogames 
for all her nieces and nephews this Christmas. Alternatively, the sale may mean that 
an investor's views on the prospects for the company have changed. This is certainly a 
different kind of information, but every trade contains information. 

The other side of the coin is that predicting and modeling the market at the mi- 
crostructure level is very difficult. We do not even know who Aunt Mabel is, much less 
her plans and how many nieces and nephews she has. Thus, her sale of stock for liquid- 
ity needs is unpredictable for us. Therefore, any price change it causes is also random to 
us. So to model this sort of environment we must explicitly allow for a large degree of 
randomness in the short-term market movements. 

The astute reader may have wondered if all market orders are always crossed with 
limit orders. The answer, of course, is no. Most market orders are crossed with limit 
orders for the reasons already mentioned, but certainly it is possible for two market 
orders to arrive at essentially the same time and be crossed with each other. By the same 
token, it is possible for an aggressive trader to place a large limit order to buy at the ask 
price or to sell at the bid price. In this case, a limit order will be crossed with a limit 
order. 




There are also stop orders and other contingencies that can be placed on orders.
However, for the most part, once the contingency is met, such as when the stop price is
hit, the order becomes a valid market order. Alternatively, for a stop limit order it becomes
a valid limit order. Thus, the four-order model just described adequately covers the vast
majority of the cases.

It is also worth noting that the market makers effectively act as though they were
placing limit orders. Sometimes the bid and ask will both be from a market maker. At
other times, one or the other may be an off-floor limit order. Nevertheless, the market
makers seek to profit from the tendency for the market to trade back and forth several
times between the bid and ask prices before moving either higher or lower. This
market-maker strategy does not always work, but it works well enough that market makers tend
to make a very good profit. A very telling fact is that New York Stock Exchange seats
have routinely sold for millions of dollars for quite a while now.


We are now ready to formalize our efficient market hypothesis as a mathematical model.
The general principle is that it is a multiplicative model wherein the near-term price
changes are swamped by the short-term variability.

Thus, a good statement of the model is the form expressed earlier: 


$$X_{t+1} = X_t(1 + e) \tag{1.3}$$


Here, we have the price today $X_t$ related to the price tomorrow by a simple random
multiplicative term $(1 + e)$, where $e$ is the random variable. In Chapter 2, we will
discuss the nature of the random variable $e$ to better understand the structure of the
market.

We note that equation 1.3 is the multiplicative form analogous to equation 1.2. This is
in contrast to the additive model, which is given by equation 1.1. The multiplicative
form is consistent with the lognormal distribution put forward by Osborne.

Later in this book, we shall show how the multiplicative and lognormal models are
the most appropriate in order to deal with the compound interest effect known to
exist in the equity markets. This forms the foundation for the ideas developed later in
this book, which allow an investor to maximize long-term compounded returns on the
portfolio.

One of the author's favorite apocryphal stories is that of the finance professor, the
economist, and the nimble trader.

One day, a finance professor, who firmly believed in the Efficient Market Hypothesis,
was walking along the street. He spotted a one hundred dollar bill lying on the ground.
He paused, and realized that in an efficient market no one would leave hundred dollar
bills lying around. He continued on his walk, confident that it was only a trick of the
light.




Minutes later an economist strolled by and saw the hundred dollar bill. He began 
to calculate to see if picking up the hundred dollar bill would improve his utility of 
wealth for the day. While he was still calculating, a quick-stepping trader walked past 
him, picked up the hundred dollar bill and hastily continued on down the street. 

The next section has much to do with quick-footed traders.


ARBITRAGE PRICING THEORY 


A close cousin of the EMH is a theory called arbitrage pricing theory (APT). Essentially,
this says that the market will not allow any riskless arbitrage to exist. A simple example
of riskless arbitrage arises if IBM is selling at $80.00 per share on the New York Stock
Exchange and at $79.90 on the Pacific Stock Exchange. A nimble trader can buy shares at
$79.90 on the Pacific and sell them for $80.00 in New York for a quick profit of 10 cents
per share. This trade is essentially riskless if done simultaneously.

Arbitrage pricing theory mandates that such opportunities should not exist, or that if 
they do, they will be quickly extinguished to the point that they are no longer profitable 
after expenses. It is easy to see why this should happen. In the case of our arbitrage
trader in IBM shares, when he buys at 79.90, his buying will tend to increase the price on
the Pacific Exchange. When he sells in New York, his selling will tend to drive the price 
there down. Thus, the two prices will quickly come into line and the arbitrage opportunity 
will be extinguished. 
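
The arithmetic behind "no longer profitable after expenses" can be sketched in a few lines of Python. This is only an illustration: the function name `arbitrage_profit` and the 2-cent per-share cost are hypothetical assumptions, not figures from the text.

```python
def arbitrage_profit(buy_price, sell_price, shares, cost_per_share):
    """Net profit from buying on one exchange and selling on another.

    cost_per_share is a hypothetical all-in expense (commissions, fees)
    for the round trip; it is an assumption, not a figure from the text.
    """
    gross = (sell_price - buy_price) * shares
    return gross - cost_per_share * shares


# IBM example from the text: buy at 79.90 on the Pacific, sell at 80.00 in New York.
wide_spread = round(arbitrage_profit(79.90, 80.00, 1000, 0.02), 2)

# After the two prices converge, the same trade no longer covers its expenses.
narrow_spread = round(arbitrage_profit(79.99, 80.00, 1000, 0.02), 2)

print(wide_spread)    # 80.0
print(narrow_spread)  # -10.0
```

Under these assumed costs, the opportunity disappears well before the two prices are exactly equal, which is the sense in which arbitrage is "extinguished."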

The ideas of APT have developed largely through the efforts of Stephen Ross and
Fischer Black. A broader version of these ideas is the concept that one can arbitrage
expectations as well as simple price. So rather than just focusing on price differential, the
term can include cross relationships between different assets connected via a common
factor.

Suppose the price of oil has risen. Then it might be reasonable to believe that the 
expectation for the earnings of companies that sell oil would be enhanced as well. They
are now able to sell at a higher price. Thus, our expectation for the price of oil stocks is 
now enhanced and we would buy. 

Such buying, if done by many, would tend to force the prices of oil stocks up in 
response to the rise in the oil commodity itself. It is an example of how one factor can 
drive many stocks. However, the same factor can have a negative impact on other stocks. 

An example of this is obvious as well. Again, assume the price of oil has risen as 
before. If we consider the impact of this fact on automobile companies, we quickly realize 
that the impact can only be negative. The effect may vary from company to company, but 
it is negative. It now costs more to fuel your car, and consumers are less likely to purchase
new cars or extra cars. 




Companies that are heavily into gas-guzzling SUVs will be hurt the most. Consumers
have the strongest disincentive with respect to these vehicles. It is much easier for them
to defer or cancel any new purchase. However, companies that are strong in the economy
car submarkets or in fuel-saving hybrids will likely benefit, relatively speaking.

One might suppose that with the advent and ubiquity of modern computing power,
such arbitrage opportunities would vanish. However, it is also the case that there has
been an enormous rise in derivative instruments in the last few decades as well. It is
now entirely possible to buy a basket of stocks representing some index, to trade a
futures contract on the index, and to trade an exchange traded fund on the index. The
number of arbitrage opportunities increases with the number of combinations of
instruments available. So when we add multiple futures contracts to the mix, we have many
more combinations. But the real arbitrage opportunities are in the large number of
options, both puts and calls, at multiple strike prices and various expiration months. On
any given day, the markets will trade over 50,000 distinct equity options. The number of
listed options is well into the hundreds of thousands. The number of arbitrage
combinations of two, three, or more options on all these stocks is well into the millions.

Thus, even with today's computing power it is still rather difficult for the market
to eliminate all arbitrage opportunities. In fact, the market does a remarkably good job,
given the large number of such arbitrages available. The bottom line is that the
arbitrage pricing theory is a pretty good model for market efficiency, but not necessarily
a perfect one.

At this point the user is encouraged to begin to explore the CD-ROM that came with
this book. Each chapter of this work has corresponding examples on the CD-ROM that
relate to the topics developed in the chapter. Although using the CD is not required, it
is recommended as a way to bring the chapter contents to life. The programs and
examples provided are generally intended to be as simple as possible and focus only on
the particular topic presented in the text.

With that in mind, the user should find it a very worthwhile exercise at the end of
each chapter to take a break and review the examples for that chapter. The exercise
should take only a few minutes in most cases, but the hands-on experience should prove
valuable in enabling readers to get a feel for the subjects covered.

The reader should start exploring the CD-ROM by reading the appendix about the CD-ROM.


CHAPTER 2


In order to discuss the idea of randomness in the markets, it is very helpful to
understand some basic statistics. This chapter introduces the essential ideas of that
discipline from an intuitive conceptual standpoint. The discussion begins with a definition
of what a probability distribution is and proceeds to discuss the well-known normal and
lognormal distributions as they relate to the markets. A little of the history of these
distributions is provided as well.

In keeping with the practical nature of this book, a handy formula is provided that
can be used to approximate the normal distribution with a rational polynomial. Do not be
alarmed if you are unfamiliar with rational polynomials. It is just a formula, and your
computer knows how to do those very well.

One of the essential topics of this chapter is the discussion of the reflection principle.
It is based on the symmetry and self-similarity of the normal distribution. This
section should not be skipped, for it is the foundation of several of the ideas presented
later in the book.


When M. F. M. Osborne first studied the market, he examined the changes in price for
New York Stock Exchange stocks. His research considered both the actual price levels
and the price changes for the stocks in question. His work was the first to scientifically
evaluate the distribution of price levels and of changes. This is a good starting point
for any research into the behavior of market prices.




First we should consider the definition of the word distribution. To a statistician, 
the probability distribution is the probability that the given random variable will be at 
a certain value level for a given observation. In other words, a distribution associates a 
range of values and their respective probabilities of occurrence. It is not just one num- 
ber. For many distributions there is a known formula to calculate the probability that a 
given observation will be at level x. To calculate the probability, we simply plug x into 
the formula and calculate. However, for many other distributions there is no known for- 
mula or the distribution itself is unknown so the convenience of a closed formula is not 
available. 

Most books on probability and statistics start with a distribution known as the 
binomial distribution. It can be used to model a simple game of coin flipping. For our 
purposes, the probability of a head will be assumed to be 50 percent as is the probability 
for tails. If you play this game one time and bet $1 on heads, then the outcome will be +1 
half the time and -1 the other half of the time. After only two coin flips, the outcomes
and their probabilities are as follows: 


Outcome    Probability
  -2          25%
   0          50%
  +2          25%

Note that the outcome of zero or breakeven is the most likely event after only two 
flips. The more extreme outcomes of plus or minus two are less likely. So even after two 
flips we see that the outcomes tend to pile up in the middle. The reason for this effect is 
clear if we examine the four possible paths in getting to these outcomes: 


--
-+
+-
++


There are only these four paths possible, and each is equally likely. An inspection of 
the paths -+ and +- shows that these two will both result in the zero outcome. Even
though the order is different, the outcome is the same. However, to arrive at plus or 
minus two, there is only one available path. Both must lose or both must win. Intuitively, 
we can see that the number of paths to arrive at the center is greater than the number of 
paths to arrive at extreme values. This principle will generalize to more coin flips. It is 
easier for a random process to arrive near the center than it is to arrive near an extreme 
value simply because there are more paths to the center and fewer to the extremes. 
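
The path-counting argument can be checked by brute force. The sketch below enumerates every equally likely sequence of +1 and -1 bets and tallies the net outcome; the function name `outcome_distribution` is a hypothetical choice for illustration.

```python
from collections import Counter
from itertools import product


def outcome_distribution(n_flips):
    """Enumerate every equally likely path of +1/-1 bets and tally the net outcome."""
    counts = Counter(sum(path) for path in product((-1, +1), repeat=n_flips))
    total = 2 ** n_flips  # number of distinct paths
    return {outcome: counts[outcome] / total for outcome in sorted(counts)}


two_flips = outcome_distribution(2)
four_flips = outcome_distribution(4)

print(two_flips)   # {-2: 0.25, 0: 0.5, 2: 0.25}
print(four_flips)  # {-4: 0.0625, -2: 0.25, 0: 0.375, 2: 0.25, 4: 0.0625}
```

With four flips there are six paths to the center but only one path to each extreme, so the piling up in the middle is already more pronounced.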




Another property to note in this discussion is that the distribution of outcomes
is symmetric about the midpoint. This results partly from the fact that the game
outcomes are symmetric. At each stage the participant is equally likely to win or lose an
equally large amount. It is also the case that each stage is composed of another similar
bet. Thus, this distribution over many outcomes will be self-similar.

It is well known in statistics that the binomial distribution will eventually converge
to the normal distribution. The mathematical proof of this convergence will not concern
us here. Rather, we shall focus on the underlying intuition of the result and try to gain
an understanding of the properties of the normal distribution from that. The important
point is that if we play the game long enough, the binomial will eventually result in a
distribution of outcomes that is very close to the normal distribution.

In Figure 2.1 we see a graph of the probability distribution for the normal
distribution. The obvious bell shape of the curve naturally inspired the nickname
bell-shaped curve for this distribution.

We can immediately note three properties:


1. The height of the curve represents the probability of that event occurring. The
center of the curve is the highest. This means that the greatest probability is for the
next random outcome from a normal distribution to come from the center.


Figure 2.1 Normal Probability Distribution (plot of dnorm(x) against x)



2. The extremes of the distribution are the least likely. It is less likely that the next
outcome will come from the extremes.


3. The distribution is symmetric. The left half is the mirror image of the right half. 


From our discussion of the simple binomial distribution, we saw that the binomial 
is self-similar in the sense that each added outcome looks like the previous ones but 
adds more paths. Knowing that the binomial converges to the normal distribution given
a large enough sample size, we have an intuitive basis to understand that the normal
distribution is self-similar as well. The reason is the same. At each added stage the
normal adds more similar paths to the resulting outcomes. Thus, we have the important 
result that any given normal distribution is simply the outcome of a succession of similar 
normal distributions. 

The cumulative distribution function (cdf) for the normal distribution is given by
Figure 2.2. The cumulative distribution function is obtained by starting at the left side
of the normal bell-shaped curve and adding up each probability. This simply amounts to
adding up the area under the probability curve. Thus, the cdf always starts at zero and
moves monotonically upward to the right to a final value of one. The fact that the final
value is one simply means that the probability is 100 percent that a given value from
the distribution will lie somewhere on that curve.

Figure 2.2 Normal Cumulative Density Function (plot of pnorm(x) against x)
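
The idea of "adding up the area" can be made concrete with a short numerical sketch. It assumes the standard normal curve (mean zero, standard deviation one) and a simple midpoint rule; the function names, the starting point of -8, and the number of slices are illustrative choices, not anything prescribed by the text.

```python
import math


def normal_pdf(x):
    """Height of the standard normal bell curve at x."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def cdf_by_summing(x, lower=-8.0, steps=20000):
    """Approximate the cdf by adding thin slices of area from the far left up to x."""
    if x <= lower:
        return 0.0
    width = (x - lower) / steps
    # Midpoint rule: each slice contributes (height at its midpoint) * width.
    return sum(normal_pdf(lower + (i + 0.5) * width) for i in range(steps)) * width


for x in (-3.0, -1.0, 0.0, 1.0, 3.0, 8.0):
    print(x, round(cdf_by_summing(x), 4))
```

The printed values rise steadily from near zero on the left to one half at the center and approach one on the right, which is the shape described for Figure 2.2.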


The self-similar symmetry of the normal distribution results in another well-known and
important property of the normal distribution. It is the reflection principle.


Reflection principle: The normal distribution is a mirror image of itself when
reflected through its central axis.


The mean is equal to the median for a normal distribution. Because the mean of
the normal distribution is the central axis, it is also true that the normal distribution
is symmetric through the mean and median as well.

Although there are several proofs of the reflection principle in statistics that involve
advanced mathematics, the result has been proved using elementary concepts as well. In
similar fashion, the reflection principle can be used to prove concepts in a very elegant
and simple manner. The important concept here is that the reflection principle arises
from a self-similar, symmetric distribution and specifically applies to the normal
distribution. We shall use the reflection principle later to prove several important
modeling results that are of general importance to investors.

We should consider that the normal distribution naturally arises in conjunction with
the additive model for market prices. Thus, the appropriate model is of this form:


$$X_{t+1} = X_t + e \tag{2.1}$$


where $X_t$ is the price at time $t$, and $e$ is a normally distributed random
innovation. Here the term innovation simply implies new information that presumably
contributed to the change in price.

We should also note that the normal distribution is completely characterized if
we know the mean and standard deviation for the distribution. The usual notation to
compactly denote a normal distribution is something like:


$$N(\mu, \sigma)$$


where $\mu$ is the mean and $\sigma$ is the standard deviation.




The probability function for the normal distribution is given by:


$$p(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-(x - \mu)^2/(2\sigma^2)\right] \tag{2.2}$$


Equation 2.2 gives us the probability density for a random variable drawn from a normal
distribution with mean $\mu$ and standard deviation $\sigma$ at the value $x$. In essence,
it defines the height of the bell curve at any given point.
The complete formula for the cumulative probability of the standardized normal
distribution is

$$N(z) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{z} e^{-u^2/2}\,du \tag{2.3}$$

where $z$ is a standardized transform of the random variable $x$ given by:


$$z = (x - \mu)/\sigma.$$


We note in passing that the terms of the formula are greatly simplified by using the 
transformed z variable. Thus, the mean and standard deviation terms need not appear 
because they are zero and one, respectively. 

Although the formula may look intimidating to some, in fact one rarely needs to deal 
with the actual formula for practical modeling. Most computing platforms that one might 
use for financial modeling include simple methods to calculate the probabilities of the 
normal distribution. In particular, the open source statistical language R and Excel both 
include such library functions.
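
As an illustration of how little work such library functions require, here is a minimal sketch using Python's standard-library `statistics.NormalDist` (Python is a substitute chosen for this example; the text itself names R and Excel). The mean and standard deviation are hypothetical values. The sketch compares the hand-written density of equation 2.2 with the library result and shows that standardizing to $z$ leaves the cumulative probability unchanged.

```python
import math
from statistics import NormalDist

mu, sigma = 0.01, 0.05      # hypothetical mean and standard deviation
dist = NormalDist(mu, sigma)
x = 0.06

# Equation 2.2 written out by hand: height of the bell curve at x.
height = math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Standardized transform z = (x - mu) / sigma, then use the N(0, 1) distribution.
z = (x - mu) / sigma
cdf_direct = dist.cdf(x)
cdf_standardized = NormalDist().cdf(z)

print(height, dist.pdf(x))
print(cdf_direct, cdf_standardized)
```

Both pairs of numbers agree to within floating-point rounding, which is why the mean and standard deviation terms need not appear once the variable is standardized.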


APPROXIMATION OF THE NORMAL DISTRIBUTION BY 
RATIONAL POLYNOMIAL 


For situations in which the computing platform does not offer a method to compute 
the normal distribution, one can use the following rational polynomial to evaluate the 
function for any given x: 


$$N(x) = 1 - p(x)\,(ak + bk^2 + ck^3 + dk^4 + fk^5) \quad \text{for } x \ge 0 \tag{2.4}$$


where

$$k = \frac{1}{1 + 0.2316419\,x}, \qquad p(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$


and the constants are


a = 0.319381530
b = -0.356563782
c = 1.781477937
d = -1.821255978
f = 1.330274429


Note that this formula applies to a standardized variable from an N(0, 1) distribution.
This means that the variable has been transformed by subtracting the mean of the
distribution and dividing by the standard deviation. Equation 2.4 applies only to values
of x that are greater than or equal to zero. However, by the reflection principle we can
use the formula for negative values of x simply by taking it as:


$$N(x) = 1 - N(-x) \quad \text{for } x < 0 \tag{2.5}$$
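
A minimal sketch of equations 2.4 and 2.5 in Python follows. The constants are taken in their standard published form for this approximation (b and d negative, c, d, and f with a leading 1), and the comparison against the error function `math.erf` is an addition made here purely as a check.

```python
import math

# Constants of the rational polynomial approximation (standard published values).
A, B, C, D, F = 0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429


def normal_cdf_approx(x):
    """Cumulative standard normal probability via equations 2.4 and 2.5."""
    if x < 0.0:
        # Reflection principle, equation 2.5.
        return 1.0 - normal_cdf_approx(-x)
    k = 1.0 / (1.0 + 0.2316419 * x)
    p = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    # Horner form of a*k + b*k^2 + c*k^3 + d*k^4 + f*k^5.
    poly = k * (A + k * (B + k * (C + k * (D + k * F))))
    return 1.0 - p * poly


def normal_cdf_exact(x):
    """Reference value computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


worst = max(abs(normal_cdf_approx(i / 100) - normal_cdf_exact(i / 100))
            for i in range(-500, 501))
print(worst)
```

Over the range from -5 to 5 the largest discrepancy is far smaller than anything that matters for practical modeling.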


Earlier we discussed the idea that the lognormal distribution was the more appropriate
model for stock prices. This result springs directly from the fact that the market
has a continuously compounded return that appears to be relatively constant over
long periods of time. Thus, the underlying time series model is a multiplicative one of
the form:


$$P_{t+1} = P_t(1 + r + e) \tag{2.6}$$


where $P_t$ is the price at time $t$, $r$ is the rate of return, and $e$ is the random
variation around the mean $r$. Here $e$ is drawn from $N(0, \sigma)$ because we have
already extracted the mean $r$ explicitly.

The above formula could also be expressed as


$$P_{t+1}/P_t = (1 + r + e) \tag{2.7}$$

or

$$\ln(P_{t+1}/P_t) = \ln(1 + r + e) \tag{2.8}$$




From the latter form, we see that on the right-hand side, the $1 + r$ part is constant for
all $t$. Thus, at each time $t$, the entire random variation is due to the $e$ term. The
systematic variation is associated with the $r$ term.

A lognormal distribution is the distribution of a variable X whose log is normally
distributed. Generally speaking, sums of independent identically distributed random
variables tend to converge to a normal distribution. In a similar fashion, products of
independent identically distributed positive random variables tend to converge to a
lognormal distribution, because the log of a product is the sum of the logs. The phrase
independent identically distributed is such an important concept in statistics that it
is often abbreviated as IID.
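
The convergence of products toward the lognormal can be seen in a small simulation. This is a sketch under stated assumptions: the factors are drawn from a uniform distribution on 0.95 to 1.05 (chosen only to show that the factors themselves need not be normal), and the sample sizes, seed, and helper name `skewness` are arbitrary illustrative choices.

```python
import math
import random


def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


rng = random.Random(42)
products = []
for _ in range(5000):
    value = 1.0
    for _ in range(250):
        # Multiplicative step: each factor is (1 + e) with e IID.
        value *= 1.0 + rng.uniform(-0.05, 0.05)
    products.append(value)

logs = [math.log(v) for v in products]
skew_raw = skewness(products)
skew_log = skewness(logs)
print(skew_raw, skew_log)
```

The products themselves show a clearly positive skew (a long right tail), while their logs are close to symmetric, as a normal distribution would be.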

All of the preceding properties of the normal distribution hold with respect to the
lognormal if we look at the properties from the perspective of the logarithmic
transformation. Essentially, this means nothing more than if we graph the lognormal
distribution on an arithmetic scale, it appears to have a skewed (long) right tail such as
in Figure 2.3. But if we graph it using a log scale, the graph appears to be
indistinguishable from the normal distribution. Therefore, considered from the perspective
of a log scale, the lognormal follows the properties of the normal as well.


Figure 2.3 Log Normal Probability Distribution (plot of dlnorm(x, meanlog = 0, sdlog = 0.3) against x)



Logarithms can take several forms. Most of us are familiar with logs base 10 in which
the log of 10 is 1, the log of 100 is 2, and so on. These are common for scientific
notation and similar uses. For computers, communications, and information theory, the most
common is base 2. In each of these fields, the basic unit of information is the bit that
has only two values, zero or one. These values naturally correspond to the on or
off state of an electronic switch.

In finance, the natural log base is base e. This is the base used for the math of
continuous compounding. Here the base e is Euler's number e, which is approximately equal
to 2.71828. Most scientific calculators have this built in as a standard function.
Sometimes one will see e raised to a power. Also, e to a power is sometimes written as the
exp function. The exp notation is very common in computing languages. From this we
have the following:

$$e^x = \exp(x) \tag{2.9}$$


Thus, e to the power of x is the same as exp(x) (see Figure 2.4).


Figure 2.4 Log Normal Cumulative Density Function (plot of plnorm(x, meanlog = 0, sdlog = 0.3) against x)



SYMMETRY OF THE NORMAL AND LOGNORMAL 


From this discussion we can now summarize the symmetry properties of the normal and 
lognormal distributions. For this purpose, we should remember that we will consider the 
symmetry of the lognormal only in the transformed log space. 

Many people have argued that the distribution of stock market prices is not strictly 
lognormal. On the face of it, there would appear to be some truth to this assertion. Cen- 
tral to the arguments against the lognormal is the fact that there appears to be a slight 
discrepancy in the tails of the distribution. The tails of the distribution are defined as the 
outlying edges—the low-probability extreme areas of the distribution. What researchers 
have found is that there are a few too many observations in the tails to be perfectly con- 
sistent with the normal or lognormal models. B. Mandelbrot has even argued that the 
variance of price changes in speculative markets is infinite. This author has studied the 
complete price history of the top 6,000 stocks currently traded and has failed to find a 
single instance of an infinite price. Absent even a single observation of an infinite price 
actually being recorded, we must look elsewhere for a reasonable explanation of the fat 
tails phenomenon. 

Fortunately, there is a better explanation available. During the 1990s, a number of 
researchers developed a branch of time series analysis that looked at the variance and 
standard deviation of price changes as a function of time. The studies started with a 
model called GARCH (generalized auto regressive conditional heteroskedasticity). At 
the core of the model is the concept of auto regression (AR). Auto regressive models 
attempt to capture any innate serial correlation in the time series in order to predict the 
future values of the series. Such models assume a process of the form 


$$X_{t+1} = a_1 X_t + a_2 X_{t-1} + \cdots + a_n X_{t-n+1} + e \tag{2.10}$$


where $X_t$ is the realization of the time series at time $t$, the $a_i$ are the
autoregressive coefficients, and $e$ is the random innovation to arrive at $X_{t+1}$.
GARCH models expand on this concept by focusing on and trying to model the variance of the
price series. The word heteroskedasticity simply means different variances. In other
words, the GARCH models do not assume a constant variance, but rather an ever-changing one
that can be modeled by identifying regime changes. That is the meaning of the word
conditional in the GARCH acronym.

The GARCH models have proved quite successful at modeling the changing cycles in 
the variance of stock prices, as well as other markets. But perhaps the most interesting 
observation is that these models can be shown both mathematically and empirically to 
predict the fat tail phenomenon with considerable accuracy. Thus, we have a model that 
predicts the variance with reasonable accuracy, as well as provides a theoretical basis 
for the observed distribution of price changes. This stands in stark contrast to those who 
claim that a few too many outliers can only mean the variance is infinite. 
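
The link between changing variance and fat tails can be illustrated with a short simulation. The sketch below assumes the common GARCH(1,1) textbook form with hypothetical parameters (omega, alpha, beta); it is not a model fitted to market data, and the text does not specify which GARCH variant or parameters the author used. Every shock is conditionally normal, yet the pooled series shows excess kurtosis, that is, too many observations in the tails relative to a single normal distribution.

```python
import math
import random


def excess_kurtosis(values):
    """Fourth central moment over squared variance, minus 3 (zero for a normal)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def simulate_garch(n, omega, alpha, beta, rng):
    """GARCH(1,1): variance_t = omega + alpha * shock_{t-1}^2 + beta * variance_{t-1}."""
    variance = omega / (1.0 - alpha - beta)  # start at the long-run variance
    shocks = []
    for _ in range(n):
        shock = math.sqrt(variance) * rng.gauss(0.0, 1.0)  # conditionally normal
        shocks.append(shock)
        variance = omega + alpha * shock * shock + beta * variance
    return shocks


rng = random.Random(7)
garch_shocks = simulate_garch(100_000, omega=1e-6, alpha=0.15, beta=0.80, rng=rng)
plain_shocks = [rng.gauss(0.0, 1.0) for _ in range(100_000)]

garch_kurt = excess_kurtosis(garch_shocks)
plain_kurt = excess_kurtosis(plain_shocks)
print(garch_kurt, plain_kurt)
```

The constant-variance series has excess kurtosis near zero, while the GARCH series is clearly positive, even though its variance is finite throughout.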




Fundamentally, the claim of infinite variance is fatally pessimistic. Assuming infinite
variance obviates any sort of statistical testing because one can never trust the results.
Infinite variance dictates that no predictions would ever be possible, simply because any
prediction would likely be in error by an infinite amount. Needless to say, such a theory
is untestable and inherently incapable of being falsified. In order to be validated by the
scientific method, a theory must be both testable and capable of being falsified. Theories
that assume an infinite variance as an axiom cannot be tested, nor can they be falsified.
Thus, they are not amenable to the scientific method. It is fair to say that they are not
even scientific theories at all, but really mere philosophical arguments or just raw
opinions.

Fortunately, the GARCH models and their close kin, the EGARCH and other models,
provide a perfectly acceptable alternative. More importantly, they provide a sound
explanation for the fat tails. So from the practical point of view of the investor
or professional, these models are ideal and offer the added benefit of the ability
to predict the volatility as well. Another often-overlooked aspect is the fact that the
empirical distribution of stock price changes, and those of most other markets as well,
also has a larger-than-expected number of small price changes. The GARCH models also
predict this phenomenon well as a low-volatility regime.


In the early days of finance, computing power was expensive, memory was at a premium,
and data was expensive. It was incumbent upon all researchers and practitioners
to conserve computer resources. Today, the equivalent of a supercomputer sits on
everyone's desktop. It is no longer unthinkable to analyze the detailed numbers from the
empirical distribution of price changes. Thus, the computational obstacle has been forever
removed. But there is still a very important place for studying the markets as a lognormal
distribution, with perhaps the variance-changing regimes of a GARCH model added
to create a better fit between the empirical data and the theoretical concept. Studying
the theoretical distributions can yield insights and formulas that would be impossible to
see by merely studying the data. By studying these well-known distributions, we can gain
understanding.


The purpose of computing is insight, not numbers. 
—Richard W. Hamming, one of the greatest mathematicians 
of the twentieth century 


So we can gain an intuitive understanding from the theoretical distributions and
gain insights not obtainable elsewhere, but we should also be aware that the
empirical distribution may tell a different story. Each is important in its own way. The
theoretical can be used for insight and intuition where appropriate but should always be
cross-checked against the empirical as a kind of reality check.


THE EMPIRICAL DISTRIBUTION 


The empirical distribution is simply the distribution of price changes that actually oc- 
curred in the market. A conditional distribution is the distribution that actually resulted 
after a certain condition or event was observed. Suppose the condition that we pro- 
pose to study is, “The market makes a new high.” Our trading system should study the 
time after the market makes a new high and simulate what happens when we buy the 
Standard & Poor's 500 stocks. For practical purposes, it is always best to use data from 
actual market trading, as opposed to data from large calculated indices. 

One reason for this is a phenomenon known as bid-ask bounce. When a market is 
rallying, there is a tendency for the last actual price to occur on the ask price. When the 
market is falling, the last reported price will tend to be at the bid. Thus, the short-term 
direction of the market tends to appear slightly exaggerated during sharp moves. It is 
noisy. 

Another reason to focus on heavily traded vehicles is called stale pricing. For an 
average such as the Standard & Poor's 500, not all stocks trade all the time. Some stocks 
may not trade for a half hour or more. Thus, any sharp market movement near the close 
may only be partially reflected in the final index closing price. To that extent, the index 
represents a lagging poor-quality measure of the true level of the underlying stocks. A 
better measure would be to use the S&P futures or the exchange traded fund (ETF) 
called Spyders with ticker symbol SPY. Each of these is more likely to reflect the true 
level of the market at any given time. 

An example might illustrate the need to consider the empirical distribution. Per our 
previous discussion, suppose our goal is to find the conditional empirical distribution 
of SPY price changes after the market makes a new high. We simply make a list of all 
such changes in the SPY the day or week after a new high event and exclude all the 
observations for which the new high condition did not apply. The remaining compiled list 
is our conditional empirical distribution. However, we do not wish to assume a normal 
or lognormal distribution because we have added a condition into the mix. We cannot 
safely assume that the market will behave at new highs in the same manner as it would 
at other times. Thus, choosing the empirical distribution represents the safest choice 
from the standpoint of the researcher. 
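
The compilation step described above can be sketched as follows. The price list is hypothetical, and two details are assumptions made for illustration: a "new high" is taken to mean a close above every earlier close in the sample, and the "change" is the relative change over the next period.

```python
def changes_after_new_high(prices):
    """Collect the next-period relative change following each new high.

    Observations where the new-high condition does not apply are excluded,
    so the returned list is the conditional empirical distribution.
    """
    changes = []
    running_high = prices[0]
    # Stop one short of the end so that a following price always exists.
    for i in range(1, len(prices) - 1):
        if prices[i] > running_high:      # condition: the market makes a new high
            running_high = prices[i]
            changes.append(prices[i + 1] / prices[i] - 1.0)
    return changes


# Hypothetical closing prices, for illustration only.
prices = [100.0, 101.0, 100.5, 102.0, 103.0, 102.5, 101.0, 104.0, 103.0]
conditional = changes_after_new_high(prices)
print(len(conditional), [round(c, 4) for c in conditional])
```

No distributional form is imposed at any point; the list of observed changes is the distribution.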

To compile a probability density function for our empirical distribution, we simply
sort the list of values in order. If we had, say, 1,000 values, the first 10 would
represent the top 1 percent of the distribution, the next 10 would be the second
percentile, and so on. The middle value by rank is the median, midpoint, and fiftieth
percentile of the distribution. Note that there are no advanced mathematical formulas for
the empirical distribution. It is what it is. We are implicitly saying that we do not know
what the theoretical form of the distribution really is. We accept merely that it seems to
repeat, which is to say that it is stable over time. When the empirical distribution is
used, one is not assuming a normal or lognormal distribution with its underlying process.
Rather, we are working in an area that statisticians term nonparametric. In this sense,
the expression nonparametric means that we are not assuming the parameters of the normal
distribution, or even that the distribution is normal.
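
A minimal sketch of the ranking procedure follows. The 1,000 values are randomly generated stand-ins for observed price changes (an assumption for illustration; in practice they would come from market data), and the variable names are hypothetical.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical stand-in for 1,000 observed price changes.
changes = [rng.gauss(0.0, 0.01) for _ in range(1000)]

ranked = sorted(changes, reverse=True)   # largest change first
top_one_percent = ranked[:10]            # first 10 of 1,000 values
second_percentile = ranked[10:20]        # next 10 values
median = statistics.median(changes)      # middle value by rank

print(min(top_one_percent), median)
```

Only sorting and counting are involved. No mean, standard deviation, or other parameter of a theoretical distribution is estimated.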

A strong case can be made to only look at the empirical distribution. It offers the
advantage of no prior assumptions other than a stable distribution with finite variability.
However, sometimes knowing the properties of the underlying distribution and the math
behind it can allow us to make powerful leaps that would otherwise not be possible. The
normal distribution is the most studied and well understood distribution of all. By
extension, the lognormal is well understood, also. So, before we throw these distributions
out for practical purposes, we should at least ask how close they are to the real
empirical distribution of the market. Figure 2.5 shows a histogram of the empirical
distribution of the S&P 500 on a monthly basis.


Figure 2.5 Histogram in S&P Monthly (vertical axis: frequency)



THE LOGNORMAL AS AN APPROXIMATION 


In order to compare the lognormal to the empirical distribution, we need only graph the 
two side by side. Figure 2.6 and Figure 2.7 show the relationship between the theoretical 
distributions and the empirical distribution for the SPY exchange traded fund that tracks 
the trading of the S&P 500 index. 

These are QQ plots performed in R. The idea of a QQ plot is to match up the quantiles 
of the two distributions that we would like to compare. We can then see where the two 
distributions differ. In particular, we should note where the tails diverge and look for the 
presence of outliers. As a general policy, looking at a QQ plot of one’s data is always 
advisable. In a QQ plot we usually draw a straight line through the central quantile of 
the distribution. It is then easy to see by eye whether the empirical distribution fits the 
expected normal distribution. Deviations from the straight line indicate deviations from 
the expected underlying distribution. 
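
The quantile matching behind a QQ plot can be sketched without any plotting library. The figures in this chapter were produced in R; the Python below is only an illustrative equivalent. The sample is randomly generated, and two details are assumptions made here: the plotting positions (i + 0.5) / n, and a reference line based on the sample mean and standard deviation rather than on the central quantiles.

```python
import random
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Pair each sorted sample value with the matching standard normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = NormalDist()
    # Plotting positions (i + 0.5) / n keep the probabilities strictly inside (0, 1).
    theoretical = [std_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


rng = random.Random(3)
sample = [rng.gauss(0.0, 0.01) for _ in range(500)]   # hypothetical returns
points = qq_points(sample)

# Reference line: sample quantile = m + s * theoretical quantile.
m, s = mean(sample), stdev(sample)
worst_gap = max(abs(q - (m + s * t)) for t, q in points)
print(points[0], points[-1], worst_gap / s)
```

For a sample that really is normal, the points stay close to the reference line. Fat tails would show up as the first and last points pulling away from it.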


Figure 2.6 QQ Plot SP vs. Normal (sample quantiles against theoretical quantiles)


Figure 2.7 QQ Plot Ln SP vs. Normal (sample quantiles against theoretical quantiles)


From Figure 2.6 we see that the normal is rather inadequate to describe the behavior
of the markets. The two differ in the tails far too much. From Figure 2.7 one might
reasonably conclude that the lognormal is an adequate model for the underlying data for
most purposes. However, as has often been noted, the tails of the empirical distribution
are too fat. In other words, the probability of arriving in one of the tails in the empirical
distribution is greater than that for the lognormal.
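To make the idea of fat tails concrete, the following sketch compares how often a fat-tailed sample lands more than three standard deviations from its mean with the frequency a normal distribution would predict. The sample is an assumption: it is drawn from a Student t distribution with 5 degrees of freedom as a stand-in for empirical log returns, not from market data.

```r
# Hypothetical stand-in for empirical log returns: Student t draws with
# 5 degrees of freedom, which have fatter tails than the normal.
set.seed(7)
returns <- rt(100000, df = 5)

# Standardize, then measure the share of observations beyond 3 standard deviations.
z <- (returns - mean(returns)) / sd(returns)
empirical_tail <- mean(abs(z) > 3)

# The same two-sided tail probability under a normal distribution.
normal_tail <- 2 * pnorm(-3)

c(empirical = empirical_tail, normal = normal_tail)
```

The fat-tailed sample visits the tails several times more often than the normal model allows, which is the pattern the QQ plots reveal at their extremes.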

For reference purposes, it is always helpful to compare our results to those achieved
at random. This is often good as a reference point or a reality check. In this case, the
appropriate reality check is given by Figure 2.8. In this figure, we see the results of a QQ
plot of the normal distribution versus some normally distributed random numbers. The
results seem to show a fairly good agreement, with a few minor discrepancies in the tails
of the distribution.

Figure 2.8 Normal Q-Q Plot (horizontal axis: Theoretical Quantiles; vertical axis: Sample Quantiles)


Thus, we conclude that for most purposes the lognormal is a good approximation to
the empirical distribution for stock prices. However, it is not perfect. Subsequent
chapters will deal with both the lognormal and the empirical distribution where
appropriate. Both have their place, and we shall identify where each can make a contribution
to our understanding. In particular, the lognormal has well-understood theoretical and
mathematical properties that can permit powerful insights into the nature of the market
process. For prediction and statistical work, the empirical distribution is generally to be
preferred. We shall give each its due.


CHAPTER 3

Investment Objectives


A key element in any investment program is defining our investment objectives. At
a superficial level, it is clear that investment objectives should include a measure
of return, a risk metric, and perhaps some consideration of personal preferences.
Each of these issues is a difficult topic on its own and worthy of individual discussion.
Another important consideration is the correlation between returns on different
investments. When two assets are positively correlated, both tend to be up at the same time
and down at the same time. The variability of return is increased without necessarily
any added return. When assets are uncorrelated, the returns often tend to cancel,
reducing variability without decreasing average return. The best scenario for an
investor is when assets are negatively correlated. When one asset is up, the other tends
to be down. In this case, we obtain the greatest reduction in variability without adversely
affecting return, because the expected combined return is simply the weighted average
of the investments.
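The effect of correlation on a two-asset portfolio can be sketched numerically. The weights, expected returns, standard deviations, and correlation values below are all hypothetical and chosen only to illustrate the point; the function name portfolio_sd is our own.

```r
# Two hypothetical assets held in equal weights (illustrative numbers only).
weights <- c(0.5, 0.5)
means   <- c(0.08, 0.12)   # expected returns
sds     <- c(0.20, 0.20)   # standard deviations

# Portfolio standard deviation for a given correlation rho.
portfolio_sd <- function(rho) {
  corr_mat <- matrix(c(1, rho, rho, 1), nrow = 2)
  cov_mat  <- diag(sds) %*% corr_mat %*% diag(sds)
  variance <- drop(t(weights) %*% cov_mat %*% weights)
  sqrt(max(variance, 0))   # guard against tiny negative rounding error
}

# The expected return is the weighted average whatever the correlation.
portfolio_mean <- sum(weights * means)

# Variability falls as correlation moves from positive to negative.
risk_by_correlation <- sapply(c(positive = 0.5, uncorrelated = 0, negative = -0.5),
                              portfolio_sd)
risk_by_correlation
```

The expected return stays at 10 percent in all three cases, while the standard deviation shrinks as the correlation declines.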


STATISTICIAN'S FAIR GAME


In statistics, the formula for the expectation of some random variable X is

$$E(X) = \frac{\sum_{i=1}^{n} X_i}{n} \qquad (3.1)$$

In common parlance, we would call this an average. It is the mean of the distribution
in statistical language. It is calculated as the sum of all the observed instances of $X_i$
divided by the number of such observations. If the random variable in question is the
return on investment during various periods, the expectation is of critical import to the
investor. The expectation is the investor's return looking forward. It must be positive and
better than alternative investments at the same risk level.
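As a small illustration of equation 3.1, the expectation of a series of periodic returns can be computed directly in R. The return values are hypothetical.

```r
# Hypothetical periodic returns on an investment (illustrative values only).
returns <- c(0.04, -0.02, 0.07, 0.01, -0.03, 0.05)

# Equation 3.1: the sum of the observations divided by their number.
expectation <- sum(returns) / length(returns)

# R's built-in mean() performs the same calculation.
same_as_mean <- isTRUE(all.equal(expectation, mean(returns)))

expectation
```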


A FAIR GAME IS A LOSER! 


In virtually every introductory statistics book ever written, the presumption is that a 
simple game with an expectation of zero is a fair game. This idea is patently absurd! 
One of the strong themes of this book is that one should never play a game or make an 
investment unless it has a positive expectation. An investor should always have an edge. 


The author's motto: 
Always have an edge. Never let your money leave home without it. 


An expectation of zero means that the player has no expected return but is exposing 
himself to risk—and even the risk of ruin—if the game is played indefinitely with a finite 
bankroll. Certainly, that is a rather bad choice for anyone. No one who is risk averse 
should ever choose to play such a game. 

In all fairness, what the authors of these books really mean by fair is that the game 
is equally bad for both sides. Nevertheless, the correct strategy for each player is not to 
play the game! 


CRITERIA FOR A FAVORABLE GAME 


The question of a fair game naturally raises the question: “What game should we be will- 
ing to play?” Part of the answer lies in the reason why the zero expectation game is 
flawed. In a zero expectation game, we are not compensated for assuming risk. Thus, we 
can conclude that a worthwhile game for us must be one in which there is an expected 
gain. This is because an intelligent and rational investor requires return as compensation 
for bearing risk. 

Next, we should consider available alternatives. If there is a game available to us that 
offers the same risk level but a higher return, we would always want to play that to the 
exclusion of a game with a lower expected return. Thus, any game must be at least as 
good as available alternatives. 

Finally, we should consider each game as a long-run sequence of games. This aspect 
of the analysis is especially relevant in the investment world. We presume there will 
always be stocks, bonds, and derivatives to buy. Investment opportunities are not going
away any time soon. So this principle requires that we prefer games or investments in
which the return from repeated plays is favorable.

This latter point may not seem obvious to most, but is at the heart of the subject
of this book. In statistics, there is a well-known principle that the expected sum of n
independent, identically distributed variables is simply n times the expectation of one
variable. Hence, if a game has a positive expectation of r for one play, then the expected
return from n plays with equal bet sizes is simply n times r. So why is this third criterion
necessary? It would seem redundant.


GAMBLER'S RUIN


The answer to this mystery lies in understanding the idea of gambler's ruin. Simply
expressed, this is the chance of going broke after repeated plays of a game. It depends
on the bet size, payoff amounts, and the probability of wins and losses. But more than
that, it depends as well on the size of one's bankroll. This is an important point that is
often overlooked.

The relationship with the bet size and bankroll can be seen intuitively. Suppose we
play the traditional fair game of flipping an unbiased coin. Each time, we bet 1 and
win an amount equal to our bet if it is heads and lose our bet if it comes up tails.
Starting with a bankroll of 10, it takes an initial run of 10 losses in a row to lose our
bankroll. The probability of this initial run of bad luck is 1 in 1,024. So it would seem
that ruin is unlikely.

But what if we played the game 1,024 times? What if we played indefinitely, betting
the same 1 unit each time? What are the chances that at some point we would
suffer a net downswing of only 10 and be ruined? For an investor, these are the
facts of life. Investors should be and are in the game for life, whether they realize it or
not. Thus, one critically important goal of investing is to stay in the game. Any strategy
or money management scheme that jeopardizes this goal is unacceptable.

Let us consider the same coin-flipping game, but this time we have a bankroll of
1,000. Now it takes a net cumulative loss of 1,000 to break us. It is much more likely that
we can play for a long time, and perhaps even indefinitely, simply because we have a
larger bankroll. From this we see that a larger bankroll reduces the risk of ruin.

Suppose we change the rules of the game one more time to require that the bet size
be 10 and go back to our original bankroll of 10. Under this scenario, even a single loss
will break us, with no chance of recovery. So, we see that our chance of
going broke on the first play is 50 percent.

But there are other ways we can bust on subsequent plays. In fact, if our cumulative
holdings ever fall to zero, we are out of the game forever. This shows in an intuitive
way that increasing the bet size relative to our bankroll increases the risk of ruin.
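The three coin-flipping scenarios above can be explored with a short simulation. This is a rough sketch rather than an exact calculation: the function name ruin_probability, the horizon of 1,024 plays, and the number of simulated paths are our own choices.

```r
# Estimate the chance of ruin in a fair coin-flip game by simulation.
# All argument names and default values are illustrative.
ruin_probability <- function(bankroll, bet, n_plays, n_sims = 2000) {
  ruined <- replicate(n_sims, {
    # Each play wins or loses the bet with equal probability.
    outcomes <- sample(c(-bet, bet), n_plays, replace = TRUE)
    # Ruin occurs if the running bankroll ever reaches zero.
    any(bankroll + cumsum(outcomes) <= 0)
  })
  mean(ruined)
}

set.seed(1)
p_small_bankroll <- ruin_probability(bankroll = 10,   bet = 1,  n_plays = 1024)
p_large_bankroll <- ruin_probability(bankroll = 1000, bet = 1,  n_plays = 1024)
p_large_bet      <- ruin_probability(bankroll = 10,   bet = 10, n_plays = 1024)

c(small_bankroll = p_small_bankroll,
  large_bankroll = p_large_bankroll,
  large_bet      = p_large_bet)
```

Over this horizon the estimated risk of ruin is lowest with the large bankroll and highest when the bet equals the whole bankroll, in line with the intuitive argument.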




OPTIMAL RETURN MODELS 


From our discussion of iterated plays of a game, it is clear that the investor needs to 
take a long-term view of things. Even though each play itself may seem to be an isolated 
event, it is actually an instance of a series of events that are sequential in time. 

It makes no difference if the investor is an individual or a professional money man- 
ager. For an individual managing his life savings or retirement account, by definition 
his actions are part of a lifetime of such investment choices. Each choice is made in 
sequence. 

For a professional money manager, his investment choices are part of a career-long
series of such choices. As such, they heavily influence his income, bonuses, and job
security. No matter who you are, the right viewpoint is that of a lifetime series of investment
decisions.

The importance of this can be seen no more clearly than in the preceding discussion 
of ruin. If one goes bankrupt first, all subsequent plays are precluded. The sequence 
terminates. So, even if a game has a positive expectation, such as the stock market, that 
expectation is excluded if one becomes ruined. In such a situation, the secret is to stay 
in the game. Any putative positive expectation is rendered meaningless if the investor is 
knocked out of the game. 

Another important consideration is to maximize one's long-term return from
repeated plays of the game. In investments, this means maximizing the compounded rate
of return. The compounded rate of return depends on the return per period.

In the simplest model, this return is then reinvested at the presumed same rate. The
essence is that the total bankroll or account value is rolled forward and reinvested into
the next period. Thus, returns multiply, not add. Each period's ending value is multiplied
by one plus the rate of return in the next period.

The math behind this is simpler than most believe. If we invest $1 in an investment
for a year and it returns 10 percent, the end-of-year value of the investment is 1.10. After
two years at the same rate, it is 1.21. The value increases at a multiplicative rate faster
than simple addition of the return would suggest. Thus, after two years the accumulated
balance is 1.10 x 1.10 = 1.21. It is not just 1.00 + .10 + .10 = 1.20, because the reinvestment
of the first .10 increases the second year's value to 1.21.
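The two-year example can be reproduced with R as a calculator. The variable names are our own.

```r
# Two years at 10 percent, as in the example above.
rate  <- 0.10
years <- 2

compounded <- (1 + rate)^years    # 1.10 x 1.10
simple     <- 1 + rate * years    # 1.00 + .10 + .10

# cumprod() rolls the balance forward one period at a time.
balance_path <- cumprod(rep(1 + rate, years))

c(compounded = compounded, simple = simple)
```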

One way to calculate this is just to multiply all the factors out for as many years as
needed. Another is to add the natural logarithms of the factors. This function is
abbreviated as ln(). After we get the total of the logs, we need to calculate the exponential
function exp() to return the number to its usual format. In order to use logs, we need to
be sure that the parameters that we input to the log function are in the neighborhood of
one (1). Another investing term for this is relative return. If one invests in a stock that
goes up 10 percent by year end, then the relative return is 1.10, not .10. If our stock goes
down 10 percent, then the relative return is .90. Note that both numbers .90 and 1.10 are
in the neighborhood of one. The formula for the log of the relative return is

$$\ln(1 + r) \qquad (3.2)$$

where r is the total return. Alternatively, we may say:

$$\ln\left(\frac{P_{t+1} + D}{P_t}\right) \qquad (3.3)$$

where $P_t$ is the stock price at time t and D is the value of any distributions that accrued
to the shareholders during the period.

Usually this means the value of dividends or stock splits. However, academics have
counted more than 300 different kinds of distributions that have been given to shareholders,
so the value of each must be included to be truly accurate. Undoubtedly, more exotic
distributions will be created in the future, so there can be no final and definitive catalog of
what is included in D. Generally speaking, it is the value of anything that the shareholder
receives as a result of owning his shares during the time period.
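A brief sketch shows that the two forms of the log relative return agree. The prices and the distribution amount are hypothetical, and the variable names are our own.

```r
# Hypothetical prices and distribution for one period (illustrative values).
p_start <- 50.00   # stock price at time t
p_end   <- 54.00   # stock price at time t + 1
d       <- 1.00    # value of distributions received during the period

relative_return <- (p_end + d) / p_start   # in the neighborhood of one
total_return    <- relative_return - 1     # r, the total return

log_from_prices <- log(relative_return)    # log of the price-based relative return
log_from_return <- log(1 + total_return)   # log of one plus the total return

c(relative = relative_return, log_relative = log_from_prices)
```

In R, log() is the natural logarithm, so it plays the role of ln() in the text.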

So, then we have the formula for average compound return:

$$\text{Avg. compound return} = \exp\left(\frac{\sum_{i=1}^{n} \ln(1 + r_i)}{n}\right) \qquad (3.4)$$


It can be understood as the average of the logs converted back to regular numbers by
using exp(), which is the inverse of the ln function. The key thing to understand about
logs and the exp function is that you add the logs to figure an average log and then
take the inverse. Adding the logs is just another way to do compound arithmetic without
doing the multiplications. The exp() is simply the transcendental number e raised to the
power of what is inside the parentheses. The famous eighteenth-century mathematician
Leonhard Euler is credited with the discovery of some of the properties of e, and it is
often called Euler's number. The now-standard choice of the letter e is due to Euler
himself.

The important point here is that the average of the logs of the relative returns gives 
us a scientific metric for looking at the value of compounding in a given investment. 
The relationship between the logs and the value of the exp() function is monotonically 
increasing. 

The higher the sum of the logs, the higher the ultimate value will be when the logs 
are converted back to real values. So any model that tries to achieve maximum return 
will need to maximize the sum of the logs of the relative returns for each period. 
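The average compound return can be checked against direct multiplication. The yearly returns below are hypothetical, and the variable names are our own.

```r
# Hypothetical yearly total returns (illustrative values only).
r <- c(0.10, -0.05, 0.20, 0.03)

# Average of the logs of the relative returns, converted back with exp().
avg_compound <- exp(mean(log(1 + r)))

# The same figure by direct multiplication: the n-th root of the product.
by_product <- prod(1 + r)^(1 / length(r))

# The simple arithmetic average of the relative returns, for comparison.
arithmetic_avg <- mean(1 + r)

c(avg_compound = avg_compound, arithmetic_avg = arithmetic_avg)
```

The result is an average relative return, so subtracting one gives the compound rate per period. It is never greater than the arithmetic average, and the gap widens as the returns become more variable.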




MARKETS ARE RATIONAL, PSYCHOLOGISTS ARE NOT 


In a rather famous paper, Professors Daniel Kahneman and Amos Tversky argued that
test subjects made irrational and biased choices when presented with gambling game
opportunities. The paper was titled "Prospect Theory: An Analysis of Decision under Risk"
and was published in 1979 in Econometrica. Essentially, the methodology of the paper
was to enlist a group of university-based test subjects that included both faculty and
students. Presumably due to simple economics, the students formed the bulk of the test
subjects. The subjects were presented with simple gambling games and asked to pick
the one they preferred. The experimenters evaluated the games on a strictly statistical
expectation basis. Essentially, the study found that in several very interesting situations,
the test subjects did not choose the game with the best expectation (in the classic but
flawed statistical sense) but seemed to prefer alternate choices with seemingly worse
expected gain.
The criterion used by the experimenters was simply:

$$E(r) = \sum_{i} p_i r_i \qquad (3.5)$$

that is, the product of the probability $p_i$ times the return $r_i$, summed over all
outcomes i.

No consideration was given to the relatively impecunious economic profile of the
students. In fact, net worth or wealth was not even known or considered by the
experimenters, as evidenced by the fact that the paper is silent on the subject.

We saw in the foregoing discussion the importance to investors of maximizing the
log of the relative returns. It would be enlightening to see how the results might differ if
evaluated from a log return standpoint. The experimenters never considered the
possibility of their subjects having a logarithmic preference for money or seeking to maximize
their long-run compounded return. Implicitly, the experimenters defined a logarithmic
preference toward money as irrational.

In one of the applications in the R language at the end of this chapter, we consider
the effect of applying a logarithmic utility function to the anomalous problems presented
in the Kahneman and Tversky paper. We shall consider Problem 1 here, as well as the
accompanying R code.

Subjects were asked to choose between two gambles, A and B. The returns and
probabilities for gamble A are shown in Table 3.1. In gamble A, there were three possible
outcomes with their associated probability, as indicated in the second column.


TABLE 3.1. Gamble A

Return    Probability
2,500     .33
2,400     .66
0         .01


TABLE 3.2. Gamble B

Return    Probability
2,400     1.00


We note that gamble B is a single outcome of 2,400, which is certain, as indicated by
the probability of 1.00. This is shown in Table 3.2. Gamble A is a .66 probability of the
2,400 plus two other outcomes. There is a .33 chance to improve the result from 2,400 to
2,500. The expected value of that improvement is .33 x 100 = 33. There is also the small
chance of .01 to lose the entire 2,400 and wind up at zero. The relative loss of 2,400 x
.01 gives us a negative value of -24. Thus, viewed on an arithmetic expectation frame of
reference, we see that gamble A has an expected value that is 33 - 24 = 9 more than
gamble B. It would thus rationally be preferred, or so the professors would presumably argue.

In the experiment, the paper reports that an overwhelming 82 percent of the subjects
chose gamble B, which has an expected value of 2,400. Only 18 percent chose gamble
A, with its slightly higher 2,409 expected value.
In order to analyze these choices from a log utility standpoint, we need to make an
assumption about the net worth of the typical student of the era. First, most students only
have part-time jobs and most of that money goes to tuition and room and board. There
is little left over for most, in that pretty much all the income is earmarked for future
expenses. Thus, the typical student of the era probably has just enough left
over for beer and pizza on the weekends. For our purposes, we shall assume that the
wealth of our students is $100 as an arbitrary estimate.

We can now write some simple code that basically uses R as a calculator. Our goal is
to compare the value of the various gambles from the perspective of a logarithmic utility
function based on our relatively impecunious student subjects and their putative wealth.
Because this is essentially a financial exercise, we shall use the natural logs as our base.
Here is the R code to calculate the desired arithmetic and logarithmic utilities of the
gambles offered to the students.


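# Problem 1: Kahneman-Tversky Problem A
xa <- c(2500, 2400, 0)    # returns for gamble A
pa <- c(.33, .66, .01)    # probabilities for gamble A
w <- 100                  # assumed wealth
u <- log((xa + w)/w)      # log utility of each outcome as a wealth relative
sum(pa * u)               # expected log utility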




TABLE 3.3. Comparison of Kahneman-Tversky to Log Expectation


Problem    KT Expectation    Log Expectation    Subjects' Choice
A          2,409             3.20               18 percent
B          2,400             3.22               82 percent


In this code, the variable xa is assigned a vector of numerical returns corresponding
to the returns for gamble A. The vector pa is the probabilities and w is the assumed
wealth of 100. Then the calculation of the utility u is straightforward:


u <- log((xa + w)/w)


This simply takes each outcome in vector xa, adds the constant w to
it, and then divides by w to express it as a wealth relative. Then the log of each outcome
is taken and the results are placed in vector u. The final line sums the product of the
probabilities times the utility to give us an expected utility. When the code for Problems
A and B is run, we get the comparison shown in Table 3.3.
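For readers who want both gambles side by side, the calculation can be wrapped in two small helper functions. This is a sketch only: the helper names, the variables xb and pb for gamble B, and the data frame layout are our own and are not taken from the code on the CD-ROM.

```r
# Problem 1: gamble A versus the certain gamble B.
xa <- c(2500, 2400, 0)
pa <- c(.33, .66, .01)
xb <- 2400
pb <- 1
w  <- 100   # assumed student wealth, as in the text

# Illustrative helper functions (names are assumptions, not from the original code).
arithmetic_expectation <- function(x, p) sum(p * x)
log_expectation <- function(x, p, wealth) sum(p * log((x + wealth) / wealth))

results <- data.frame(
  gamble      = c("A", "B"),
  arithmetic  = c(arithmetic_expectation(xa, pa), arithmetic_expectation(xb, pb)),
  log_utility = c(log_expectation(xa, pa, w), log_expectation(xb, pb, w))
)
results
```

Gamble A wins on the arithmetic expectation, while gamble B wins on expected log utility, which matches the choice most subjects made.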

Viewed from the standpoint of expected log utility, it is clear that the subjects made 
better choices than the experimenters. The code on the CD-ROM associated with this 
chapter shows five of the problem choices presented in the paper on prospect theory 
that were deemed anomalous. In each case the test subjects chose a gambling choice 
that was inferior to the one dictated by a simple arithmetic expectation theory. Using tra- 
ditional statistical fair game logic the professors naturally deemed these answers wrong 
and therefore irrational. 

However, in each case, the test subjects chose the one that had the highest expected
log utility. Clearly, the simplest, most straightforward interpretation of the data is that
the test subjects are quite good at evaluating their log utility functions. In contrast, the
conclusions of the study broke the analysis down to several anomalous cases that each
required its own divergent explanation. Together, the complex of these anomalies and
rationalizations is called prospect theory. For this work Kahneman received
a Nobel Prize in 2002; Tversky had died in 1996.

The author asks, were the students truly irrational because they consistently selected 
optimal logarithmic choices or were the professors the irrational ones? Let the reader 
decide. 


THE ST. PETERSBURG PARADOX 


It would be understandable if modern researchers had overlooked the idea of logarithmic 
utility because it had never been discovered. However, that is not the case. The idea has 
been known for a very long time—more than 200 years! 


In September 1713, Nicolas Bernoulli had first formulated a problem of the following
sort: a casino pays $1 on the first coin flip that is heads and $2 on the second head and
continues to double the payoff each time a head is tossed. However, if any single tail
is tossed, the game is over and the player keeps the amount won up to that point. The
question is, how much should one wager on such a game?

Using expectation analysis we see that each flip has a 1/2 probability of turning up
heads. So the sequence of probabilities is 1/2, 1/4, 1/8, 1/16 and so on. The
payoffs are 1, 2, 4, 8, 16, ... So the payoff after 1 toss is 1 x 1/2 = .50, after the second it is
2 x 1/4 = .50, and so on. Each toss is worth an added .50, and there are infinitely many
tosses possible. So, expectation analysis tells us we should rationally be willing to
wager any amount. But common sense tells us that is just plain silly.

Fortunately, Nicolas Bernoulli had a cousin by the name of Daniel Bernoulli who was
a mathematician and statistician at the University of St. Petersburg in then czarist
Russia. Daniel Bernoulli had already proposed the idea of a decreasing marginal utility
of money, arguing, for example, that twice as much money is not necessarily twice as
good. Bernoulli had also recognized the importance of one's wealth in the analysis. A
fixed gain might be a small increment in total wealth to a rich man, but might make
a big difference in the circumstances of a poor man. Consideration of one's wealth was
therefore important.

When Bernoulli published his analysis of the St. Petersburg Paradox, it included
the idea that money should be valued using the utility function concept. It should come as
no surprise that the utility function he settled upon was the natural log, or ln() function.
This, of course, has the base e and is the natural base for compound interest
calculations in finance, as discussed previously. Readers may recall that the constant e was
thoroughly analyzed by Leonhard Euler. So perhaps we should not be
surprised to learn that Euler and Daniel Bernoulli boarded together while they were
both associated with the University of St. Petersburg. The results achieved when great
minds work closely can be astounding.
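The contrast between the arithmetic expectation and the logarithmic valuation of the St. Petersburg game can be sketched numerically. Two simplifications are assumed here: the game is cut off after 60 tosses so that the sums are finite, and the log is applied to the payoff alone, ignoring the player's existing wealth. The variable names are our own.

```r
# Truncate the game at a finite number of tosses (an illustrative cutoff).
max_tosses <- 60
k      <- 1:max_tosses
prob   <- 0.5^k        # 1/2, 1/4, 1/8, ...
payoff <- 2^(k - 1)    # 1, 2, 4, 8, ...

# Arithmetic expectation: each toss adds .50, so the total grows without bound
# as more tosses are allowed.
expected_payoff <- sum(prob * payoff)

# Expected log of the payoff converges; exp() converts it back to money.
expected_log         <- sum(prob * log(payoff))
certainty_equivalent <- exp(expected_log)

c(expected_payoff = expected_payoff, certainty_equivalent = certainty_equivalent)
```

Raising max_tosses increases the arithmetic expectation by .50 per toss, while the log-based value settles at about 2 under these simplifying assumptions.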


COMPOUNDED RETURN IS THE REAL OBJECTIVE


In the preceding discussion, we have seen that compounded growth is governed by
the log function and preferably the natural log ln. We have also seen how maximizing
the average log maximizes long-term compounded growth for our investments. From the
work of Kahneman and Tversky, we learned that their subjects tended to operate using
a natural log utility function based on their supposed wealth. The log model was clearly
consistent with the preferences of the test subjects and comprises a simpler model
that removes the arcane rule-based complexity of prospect theory. Finally, we discussed
the ground-breaking work of Daniel Bernoulli more than 200 years ago when he first
proposed using the natural log as the appropriate utility function for the solution to the
St. Petersburg paradox.

For the remainder of this book, we shall accept the premise that one part of our
investment objective should be to maximize the expected ln of the wealth of our portfolio
at each point in time. In doing so, we keep in mind that the simple arithmetic average of
our portfolio returns is our end of period average portfolio return. However, the ultimate
long-term value of the portfolio return to us is its contribution to our ln weighted wealth.
Thus, any metric we use must include in it an explicit measure of the log weighted return.
This latter is essentially equivalent to the long-run compound return.

In this discussion, we have ignored the variability of returns. It seems clear that a
return that is certain to always yield a given amount is preferable to an identical return
that is highly variable. If a minimum return is assured, it is possible to budget the return
for future use. However, if the return is so variable that a loss is a significant possibility,
no budgeting is possible. In other situations, a return that has little or no variability may
allow the use of leverage, whereas a more variable return would preclude such an option
because of the risk of ruin.


DEFINING RISK 


The definition of risk has always been a bit controversial in the finance world. Certainly
part of the idea of any risk measurement is to evaluate the chance of losing. Along this
line, one might be tempted to simply let risk be the probability that a given investment 
will lose money. Clearly, this is an interesting number, but falls short because it does not 
measure how much money is made or lost. 

Another measure of risk might be to look at the average loss. By itself, this metric has 
limited value as well. The trouble is that it contains no probability measure. Considered 
alone, it cannot be evaluated relative to how likely the loss is relative to the gain. 

Investors primarily make or lose money through changes in market value. Although 
interest earned and dividends collected all contribute to investment return, most of the 
higher returns and variability are due to market changes. As we discussed, the market 
seems to resemble a lognormal distribution, arguably with fat tails. Because the attributes
of the normal distribution family are well understood, most people in finance have settled
on the variance and standard deviation as the best measures of the variability of returns.

Any normal distribution is characterized by only two parameters: the mean and variance.
Clearly, the mean is always relevant to investors, and thus, it seems natural to look
to the variance as a reasonable measure of the variability of returns. We should remember
that the standard deviation is simply the square root of the variance, so if we know
one, we know the other by a simple calculation. For most financial work, the standard
deviation is the preferred unit of measure because it relates to units that are naturally
understood. For example, if we had an expected return of +10 percent and a standard
deviation of 15 percent, then we can use the well-known properties of the normal distribution
to aid our intuition. We can look at the standard deviation as a simple plus or minus
measure of variability. For the previous example, we would expect our distribution to
be centered at +10 percent, but with a variation of plus or minus 15 percent about two
thirds of the time. This means we would expect returns as low as -5 percent to as high as
25 percent some two thirds of the time. About one sixth of the time, we would expect
returns greater than 25 percent, and the other one sixth we should get returns worse than
-5 percent.

Using the properties of the normal distribution, we can make similar statements with
greater confidence as well. For this, we simply make our plus or minus bands equal to
two standard deviations. Thus, the return of +10 percent plus or minus 30 percent
(-20 to +40 percent) is a range that we would expect to occur about 95 percent of the time.
Only about 5 percent of the time would we expect the outcome to lie outside that range.
It is clear that the normal family of distributions can be very useful for analyzing
variability of returns. In particular, the standard deviation is pretty much the accepted simple
measure of risk. However, it does have its detractors who have legitimate arguments.

A common objection to the standard deviation as a measure of risk is that it
includes both upside price changes and downside changes. One can make the very
reasonable argument that only downside changes should be included, because those are the
changes that actually result in loss for a long-only investor. There can be little question
that this is a valid claim with respect to the narrowly focused question of losses alone.
But one can carry that argument to extremes. Suppose a given portfolio gained 6 percent
over the year. However, it was subject to wide swings during the intervening months.
Because the portfolio lost no money at the end of the year, does that mean it was riskless?
Does it mean it had no more risk than investing in 6 percent Treasury bills during the
same time? Common sense would say no.
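The plus or minus bands of the normal distribution are easy to verify in R. The sketch below uses the example of a +10 percent expected return with a 15 percent standard deviation, so that two standard deviations equal 30 percent; the variable names are our own.

```r
# Expected return and standard deviation from the example.
mu    <- 0.10
sigma <- 0.15

# Probability of landing within one and within two standard deviations.
within_one <- pnorm(1) - pnorm(-1)   # roughly two thirds
within_two <- pnorm(2) - pnorm(-2)   # roughly 95 percent

# The corresponding return bands.
band_one <- mu + c(-1, 1) * sigma    # -5 percent to +25 percent
band_two <- mu + c(-2, 2) * sigma    # -20 percent to +40 percent

# Chance of a return above the upper one-standard-deviation band (about one sixth).
above_one <- pnorm(band_one[2], mean = mu, sd = sigma, lower.tail = FALSE)

round(c(within_one = within_one, within_two = within_two, above_one = above_one), 4)
```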

However, there is really a bigger issue. That issue is estimation. We use measures
of risk to estimate how much variability there is in a given return. If we want to evaluate
the merits of a given trading system via historical backtesting, we need the best estimate
of variability we can find. We also often wish to measure the performance of a portfolio
manager. Again, the best measure of variability is needed. Sometimes we wish to estimate
the volatility of a given stock. For that, we need the best measure of variability we can
find.


A different but related requirement is to measure significance. We often wish to know
if a trading system is significantly better than average, or did its backtested results come
about merely through chance? It is a statistical fact that systems that show high
variability have a better chance of getting lucky than systems that are more stable. A confluence
of a few large outliers can sometimes make a system or portfolio manager look much
better than is the case. This is especially true if the number of observations is limited.




Fortunately, there is a solution to both the estimation and the significance issues.
Serendipitously, it turns out that the answer is to use the standard deviation as the
measure of choice. Again, based on the well-understood properties of the normal family,
the standard deviation is the best estimator of variability possible under the criterion of
maximum likelihood. In other words, the variability of the distribution is most likely to
be well measured by the standard deviation. This is a very powerful argument in favor of
the standard deviation as our preferred risk metric.

To measure significance, one only needs the mean and standard deviation to perform
a t test. The t statistic is given by this formula:

$$t = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \qquad (3.6)$$

where $\bar{X}$ is the average of the data points X, $\mu$ is the population mean, $\sigma$ is the
standard deviation of the Xs, and n is the number of observations in the sample. This statistic
is governed by the t distribution that is also commonly called the Student distribution.

The story behind the name Student is a rather interesting one. William Sealy
Gosset (1876-1937) was the master brewer at the Dublin Guinness factory. Presumably,
because of his daytime occupation, he felt compelled to write his statistical papers
under the pseudonym Student. Another theory has it that his employer considered the use
of statistics to be a trade secret and forbade him to reveal his techniques. In any event,
Gosset developed what is now known as the Student distribution. By all rights, perhaps,
it should be called the Gosset distribution. This case is yet another example of Stigler's
law of eponymy, which essentially says that great breakthroughs are rarely named after
their original inventor. Perhaps Gosset's own modesty in writing under a pseudonym is
much to blame as well.

In any event, the understanding of Gosset's distribution was augmented by his
contemporary and correspondent R. A. Fisher into the full-blown t test. The t test provides
us with an excellent and well-understood means of testing for significant results, either
in backtesting or in evaluation of portfolio management.
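A one-sample t test of this kind can be sketched in R and checked against the built-in t.test() function. The monthly returns below are hypothetical, the benchmark mean of zero is an assumption, and sd() supplies the sample standard deviation.

```r
# Hypothetical monthly excess returns of a trading system (illustrative values).
x   <- c(0.021, -0.013, 0.034, 0.008, -0.004, 0.017, 0.026, -0.019, 0.012, 0.005)
mu0 <- 0   # assumed population mean under the null hypothesis

n      <- length(x)
t_stat <- (mean(x) - mu0) / (sd(x) / sqrt(n))

# Two-sided p-value from the Student t distribution with n - 1 degrees of freedom.
p_value <- 2 * pt(-abs(t_stat), df = n - 1)

# Cross-check against R's built-in one-sample t test.
check <- t.test(x, mu = mu0)

c(t_stat = t_stat, p_value = p_value)
```

A small p-value suggests that results this far from the benchmark would rarely arise through chance alone.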

Alternative ideas such as semi-variance and semi-standard deviation have been pro- 
posed. These metrics measure risk using only losses, but are otherwise similar to the 
usual standard deviation formula. The key difference is that the usual standard deviation 
formula includes all of the data. It simply has more information. By contrast, measures 
such as the semi-standard deviation defenestrate the winning portion of the data. Given 
the long-term upward drift of the market, the market rises in roughly 60 percent of 
all months. Consequently, these alternative measures wind up ignoring 
as much as 60 percent of their data. In particular, they completely ignore all lucky upside 
outliers. There is simply no way for such metrics to evaluate when luck associated 
with high variability has been the underlying factor. They have systematically removed 
all information regarding such luck. 
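
The point about discarded information can be illustrated with a short Python sketch. The definition of semi-standard deviation used here (squared shortfalls below a threshold of zero, averaged over all observations) is one common convention and is an assumption of this example, as are the function names and the two hypothetical return series.

```python
import math


def standard_deviation(returns):
    """Population standard deviation using every observation."""
    mean = sum(returns) / len(returns)
    return math.sqrt(sum((r - mean) ** 2 for r in returns) / len(returns))


def semi_deviation(returns, threshold=0.0):
    """Downside deviation: only returns below the threshold contribute.

    Assumption: shortfalls are measured from a threshold of zero and the
    sum of squares is divided by the full number of observations.
    """
    shortfalls = [min(r - threshold, 0.0) for r in returns]
    return math.sqrt(sum(d ** 2 for d in shortfalls) / len(returns))


# Two hypothetical return series that differ only in one lucky upside outlier.
steady = [-2.0, -1.0, 1.0, 2.0, 3.0]
lucky = [-2.0, -1.0, 1.0, 2.0, 30.0]

print(standard_deviation(steady), semi_deviation(steady))
print(standard_deviation(lucky), semi_deviation(lucky))
```

In this sketch the semi-deviation is identical for both series, while the standard deviation rises sharply for the series containing the lucky outlier.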


MINIMUM RISK MODELS 


Given a rational choice of a risk metric, it is now possible to define another type of 
investment goal. Specifically, this is the goal of minimizing risk. This arises in scenarios 
where an investor has a certain goal with respect to return but little incentive to exceed 
that goal. Rather, for this type of investor it may be more important to reduce the risk of 
not achieving the desired goal. 

A good example of this situation is a defined-benefit pension plan. Under this, the 
employer has defined a fixed benefit payable to the employees. It is important that the 
capital of the fund is able to achieve the desired return goal so as to meet the benefit 
obligations. 

If the fund does not meet the return goal, the corporation or governmental employer 
will have to dip into current income to make up the difference. So, in this case, the em- 
ployer may wish to use a strategy that targets a given acceptable return goal and seeks 
to minimize the risk of not achieving that goal. 

Clearly, the use of standard deviation is still the best way to measure the variability of 
returns around the projected mean. The well-understood statistics of the normal family 
also facilitate the development of models that can minimize risk or alternatively minimize 
the probability of falling below a given level of return. 


CORRELATION OF ASSETS 


When statisticians speak of sampling real-world events, they inevitably refer to independent, 
identically distributed (IID) variables. Critical to this is the concept of 
independence. This means that each new observation has no relationship to any previous 
observation. In other words, the variables have no memory, in much the same way 
that a coin has no memory. Whatever the last result was in no way influences future 
outcomes. 

If the IID assumption is true, we have some very interesting results: 


1. The expected sum of n IID variables X is given by 

   nE(X)    (3.7)

2. The variance (VAR) of the sum of n IID variables X is given by 

   VAR(X1 + X2 + ... + Xn) = VAR(X1) + VAR(X2) + ... + VAR(Xn)    (3.8)


However, when the assumption of independence is violated, we have a different result. 
Variables that are not independent are said to be correlated. One measure of the 
correlation is the covariance between two random variables X and Y. The covariance 
(COV) is given by 

COV(X, Y) = E[(X - E(X))(Y - E(Y))]    (3.9)

When two assets are correlated, the overall variance of their sum is given by 

VAR(X + Y) = VAR(X) + VAR(Y) + 2 × COV(X, Y)    (3.10)


If two assets tend to move together, they are said to be positively correlated, and 
the covariance will be positive. If they tend to move 
in opposite directions, the covariance will be negative. If the assets are independent, the 
covariance will be zero or near zero. 

We should also observe that positively correlated assets tend to increase the result- 
ing variance and hence the risk. But negatively correlated assets will tend to reduce the 
overall variance and hence the risk. So, to the extent that we can identify negatively 
correlated assets, we can help to reduce the overall portfolio’s level of risk. 
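
The effect of the covariance term on combined variance can be checked numerically. The sketch below uses population (divide by n) formulas and three short hypothetical series; the helper names and data are assumptions chosen only to illustrate the identity for the variance of a sum.

```python
def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance E[(X - E(X))(Y - E(Y))]."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def variance(xs):
    """Variance is the covariance of a series with itself."""
    return covariance(xs, xs)


# Hypothetical price changes, for illustration only.
asset_x = [1.0, -2.0, 3.0, -1.0, 2.0]
moves_with = [2.0, -1.0, 2.0, -2.0, 1.0]       # tends to move with asset_x
moves_against = [-2.0, 1.0, -2.0, 2.0, -1.0]   # mirror image of moves_with

for label, asset_y in (("positive", moves_with), ("negative", moves_against)):
    combined = [x + y for x, y in zip(asset_x, asset_y)]
    direct = variance(combined)
    by_formula = (variance(asset_x) + variance(asset_y)
                  + 2 * covariance(asset_x, asset_y))
    print(label, covariance(asset_x, asset_y), direct, by_formula)
```

The two series have the same individual variances, so the difference in the combined variance comes entirely from the sign of the covariance term.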


SUMMARY OF CORRELATION RELATIONSHIPS 


Table 3.4 summarizes the effect on risk that will be achieved for different correlation 
regimes. For example, if two assets are positively correlated, then diversification will 
yield only a minimal decrease in risk. 
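
To make the three correlation regimes concrete, the following sketch computes the standard deviation of an equally weighted portfolio. It assumes, purely for illustration, that every asset has the same volatility and that every pair of assets has the same correlation; the function name and the numbers are hypothetical. Under those assumptions the portfolio variance is the asset variance times (1 + (n - 1) × rho) / n.

```python
import math


def equal_weight_portfolio_sd(asset_sd, n, rho):
    """Standard deviation of n equally weighted assets.

    Assumptions: every asset has volatility asset_sd and every pair of
    assets has the same correlation rho.
    """
    variance = asset_sd ** 2 * (1.0 + (n - 1) * rho) / n
    if variance < 0:
        raise ValueError("rho is too negative to be valid for n assets")
    return math.sqrt(variance)


# Hypothetical 20 percent volatility per asset.
for n in (1, 4, 16):
    row = [round(equal_weight_portfolio_sd(20.0, n, rho), 2)
           for rho in (0.5, 0.0, -0.05)]
    print(n, row)
```

With zero correlation the risk falls with the square root of n; with positive correlation it falls more slowly, and with negative correlation it falls faster.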

TABLE 3.4  How Correlation Affects Comovement and Risk 

Correlation    Comovement               Effect on Risk*
Positive       Together                 Minimal decrease
Negative       Opposite                 Large decrease
Zero           Independent movements    Moderate decrease

*Effect on risk is for fixed position sizes as additional positions are added. Note that as 
diversification is increased using uncorrelated assets, the risk is reduced approximately as a 
function of the square root of n, where n is the number of equally weighted positions. 
Diversification with positively correlated assets, as is typical for most stocks, reduces risk at a rate 
less than the square root of n. Use of negatively correlated assets such as call options, written on a 
portfolio, will reduce risk at a rate faster than the square root of n. 


A close relative of the covariance is the correlation coefficient, r. Essentially, the 
idea is to scale the covariance to a simple and intuitive number that will characterize 
the correlation relationship. So the correlation coefficient, r, is defined as the covariance 
divided by the product of the two standard deviations, that is, by the square root of the 
product of the two variances. The formula is 

r = COV(X, Y) / √(VAR(X) × VAR(Y))    (3.11)


This effectively rescales the covariance to a range from -1.0 to +1.0. A 
correlation near +1 indicates strong positive correlation, while a number near -1 shows 
a strong negative correlation. A near-zero correlation tells us that the two variables have 
little or no linear relationship with each other. 
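
A minimal Python sketch of the correlation coefficient follows. It scales the covariance by the square root of the product of the two variances; the helper names and the small data sets are assumptions chosen so that the three cases are easy to see.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(xs, ys):
    """Population covariance of two equally long series."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def correlation(xs, ys):
    """Covariance scaled by the product of the two standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


base = [1.0, 2.0, 3.0, 4.0, 5.0]
same_direction = [2.0, 4.0, 6.0, 8.0, 10.0]
opposite_direction = [10.0, 8.0, 6.0, 4.0, 2.0]
unrelated = [1.0, -1.0, 0.0, -1.0, 1.0]

print(correlation(base, same_direction))
print(correlation(base, opposite_direction))
print(correlation(base, unrelated))
```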


BETA AND ALPHA 


Modern portfolio theory (MPT) has defined various measures to evaluate the relationship 
and correlations between stocks. Beginning with Harry M. Markowitz's seminal paper in 
the Journal of Finance in 1952, modern portfolio theory has a long and rich history. 
Markowitz was the first to understand how to put measures of expected return, risk, and 
correlation together to select an optimal portfolio, given his set of assumptions. 

The Markowitz model optimized the expected mean and variance using a standard 
arithmetic return model. One key realization was that the variance-covariance matrix was 
critical to solving the problem in a multiasset environment where most assets are highly 
correlated. The model could be used to define an efficient frontier line of portfolios. Each 
point on the line corresponded to a portfolio that had the highest return for a given risk 
level. Alternately, one could view each point as the lowest risk for the given return. 

The initial Markowitz model would require an estimate of the expected arithmetic 
mean and variance for each stock. The model itself is silent as to how such estimates 
should be developed. The two usual sources are an analyst's estimate or an historical 
estimate. The other required information is either the correlation matrix or the variance- 
covariance matrix. 

Essentially, the latter requirement means that the covariance or correlation between 
each pair of stocks must be estimated. The problem grows quadratically as the number 
of stocks grows. Thus, for 20 stocks we would require a table of 400 entries. Each entry 
is simply the pairwise covariance for each pair of stocks. We note that the main diagonal 
in which row i equals column j is simply the covariance of the stock with itself. This can 
be shown to be simply the variance of the stock itself. The number 20 squared gives us 
the 400 entries. 
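
The arithmetic can be sketched in a few lines of Python. The function name is hypothetical. The count of distinct pairs relies on the fact that the covariance of X with Y equals the covariance of Y with X, so the off-diagonal half of the table repeats itself.

```python
def covariance_inputs(n_stocks):
    """Return (total table entries, variances on the diagonal, distinct pairs)."""
    total_entries = n_stocks * n_stocks
    variances = n_stocks
    distinct_pairs = n_stocks * (n_stocks - 1) // 2
    return total_entries, variances, distinct_pairs


print(covariance_inputs(20))   # (400, 20, 190)
```

Even after allowing for symmetry, the number of distinct covariances still grows with the square of the number of stocks.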

However, to design an optimal portfolio for some 2,000 stocks on the New York 
Stock Exchange is a more daunting task. It would require a matrix of 4 million entries. 
During the 1950s and 1960s, this was beyond the main memory capacity of all computers. 
As a result, Markowitz’s work was proclaimed as a great theoretical breakthrough but 
inspired little in the way of practical application. 


In 1959, James Tobin extended the work of Markowitz to include the idea of a cash 
position. This allowed the portfolio manager to explicitly use cash as a way to manage 
risk and return. It also allowed a means by which one could add leverage to the overall 
portfolio optimization problem. 

It was not until the work of William F. Sharpe in 1964 that the science of modern 
portfolio theory became practical. 
Sharpe's innovation was to realize that all stocks tended to be positively correlated (with 
some exceptions). Realizing this, he proceeded to devise a surrogate for the market that 
could reduce the number of correlations from potentially millions to just the single 
correlation of each stock with the market. The usual market index at the time was the Standard & 
Poor's 500 index. The key enabling insight was the realization that essentially all investment 
assets were correlated with the market. 

Sharpe's new theory became known as the capital asset pricing model (CAPM). At the 
time it was a marvelous breakthrough because it allowed investment managers to optimize 
their portfolios on the limited memory of the computers of that era. 

Essentially, the idea of CAPM is that each security has a correlation with the overall 
market and has a separate intrinsic variability that is not correlated with the market. 
Sharpe used a simple regression of this form: 


Stock change = β (market change) + α + u    (3.12)


where β (beta) is the slope coefficient, α (alpha) is the intercept, and u is the 
uncorrelated error term. 

More highly correlated and presumably more volatile stocks might have a beta of 
2.0. This would mean that the stock moves twice as fast as the underlying market change. 
An average stock might move 1.0 times as much as the market, percentagewise. It would 
essentially move with the market. Very stable stocks might only move 0.50 times as much, 
or half as much. 
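
As an illustration of how the regression coefficients might be estimated, the sketch below fits a simple least-squares line of stock changes on market changes. The function name and the data are hypothetical; the stock series is constructed to move exactly twice as much as the market plus a constant, so the fitted beta and alpha are easy to verify.

```python
def fit_alpha_beta(market_changes, stock_changes):
    """Ordinary least-squares fit of stock = beta * market + alpha + error."""
    n = len(market_changes)
    mean_m = sum(market_changes) / n
    mean_s = sum(stock_changes) / n
    cov_ms = sum((m - mean_m) * (s - mean_s)
                 for m, s in zip(market_changes, stock_changes)) / n
    var_m = sum((m - mean_m) ** 2 for m in market_changes) / n
    beta = cov_ms / var_m              # slope: covariance over market variance
    alpha = mean_s - beta * mean_m     # intercept
    return alpha, beta


# Hypothetical percentage changes, for illustration only.
market = [1.0, -2.0, 3.0, 0.5, -1.5]
stock = [0.5 + 2.0 * m for m in market]   # built with beta 2.0 and alpha 0.5

alpha, beta = fit_alpha_beta(market, stock)
print(round(alpha, 6), round(beta, 6))
```

Real data would not fit the line exactly; the leftover differences are the error term of the regression.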

It has been shown that betas tend to persist. Using decile analysis, it was found that 
the highest-decile stocks generally tended to stay in the highest deciles in subsequent 
periods and that low-decile stocks stayed in the low deciles. 

The alpha coefficient is a different story. Alpha is a measure of how the stock tended 
to perform against the market. A positive alpha means the stock performed better than 
the market after its correlation with the market index had been factored out. A negative 
alpha means it underperformed. If alpha is near zero, the stock performed in line with 
the market. The trouble with alpha is that it does not persist. High-decile alphas in one 
period appear to be randomly distributed throughout all the deciles in subsequent peri- 
ods. Unlike beta, alpha does not appear to be a good predictor of future risk-adjusted 
behavior of the stock. 

One important concept to understand is that the CAPM model decomposes the risk 
associated with stock ownership into two broad classifications. The first is the part of 
the risk that is directly correlated with the market. This is represented by the beta 
coefficient in equation 3.12. The underlying rationale is that this part is not diversifiable. Each 
stock is correlated with the market to some degree. Thus, the stocks are correlated with 
one another. Therefore, this is the correlated portion of the risk that all stocks share in 
common with one another. It cannot be diversified away. However, the portion of the 
variance that is attributed to the u term in the above regression model is uncorrelated 
with the overall market. It can be diversified so as to reduce risk. 

We recall that things are linear in the variance and that we can decompose the total 
risk into its constituent components as follows: 


    Correlated risk as measured by beta 
  + Uncorrelated risk as measured by squared errors (u) 
  ----------------------------------------------------
    Total risk of the stock 


For example, a typical stock might have something like 65 percent of its total vari- 
ance explained by its relationship with the market. So, the calculation might look some- 
thing like this: 


     65 percent   Correlated risk explained by beta 
  +  35 percent   Uncorrelated risk (unexplained risk) 
  -------------
    100 percent   Total risk of the stock 


Of course, these percentages are merely representative and can vary significantly 
from stock to stock and from one time period to the next. The correlated risk is that 
fraction of the variance, expressed by R squared, which results from the regression cal- 
culation. The uncorrelated portion is based on the sum of the squared residuals, or the 
errors in which the regression did not quite fit the data precisely. 
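
The split between correlated and uncorrelated risk can be sketched numerically. The code below fits the same kind of least-squares line and reports the fraction of the stock's variance explained by the market (R squared) and the fraction left in the residuals. The function name and data are hypothetical and the percentages it prints have no empirical meaning.

```python
def decompose_risk(market_changes, stock_changes):
    """Return (explained fraction, unexplained fraction) of stock variance."""
    n = len(market_changes)
    mean_m = sum(market_changes) / n
    mean_s = sum(stock_changes) / n
    var_m = sum((m - mean_m) ** 2 for m in market_changes) / n
    var_s = sum((s - mean_s) ** 2 for s in stock_changes) / n
    cov_ms = sum((m - mean_m) * (s - mean_s)
                 for m, s in zip(market_changes, stock_changes)) / n
    beta = cov_ms / var_m
    alpha = mean_s - beta * mean_m
    residuals = [s - (alpha + beta * m)
                 for m, s in zip(market_changes, stock_changes)]
    correlated = beta ** 2 * var_m                      # explained by the market
    uncorrelated = sum(u ** 2 for u in residuals) / n   # mean squared residual
    return correlated / var_s, uncorrelated / var_s


# Hypothetical percentage changes, for illustration only.
market = [1.0, -2.0, 3.0, 0.5, -1.5]
stock = [2.0, -1.5, 2.5, 2.0, -3.0]

explained, unexplained = decompose_risk(market, stock)
print(round(explained, 3), round(unexplained, 3))
```

Because the fit is by least squares, the two fractions add up to the total variance of the stock.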

The key understanding that Sharpe had was that the correlations between stocks 
could be neatly summarized by their correlation with the overall market. Thus, the beta 
represents that correlation with a single term. This can greatly economize the CAPM 
model as compared to the older Markowitz model where each covariance was required. 
The other key insight that Sharpe propounded was that the uncorrelated risk intrinsic to 
each stock was largely uncorrelated with that of other stocks. The uncorrelated risk 
could therefore be substantially eliminated through diversification. 

Thus, the beta term that relates to the correlated risk expresses the risk that is associated 
with the market. It is commonly called systematic risk or market risk. According 
to the theories of CAPM, this risk cannot be reduced through diversification because 
all the assets are correlated with the overall market. Because it cannot be eliminated, 
this is the only risk that the investor will be rewarded for bearing. 

Several large cross-sectional studies have looked at this question from the standpoint 
of beta and the subsequent returns. Generally speaking, the theory holds up well. 
Higher beta stocks do perform better than lower beta stocks, but at the price of greater 
correlated risk. 

Betas were also found to persist from one period to the next. Generally, the stocks 
that were in the highest deciles were in the highest deciles in later periods. From this, 
we can conclude that beta is a reasonable measure of correlated market return and of 
nondiversifiable risk. 

However, it was a different story for alphas. Alpha represents the excess return a 
given stock had over and above its expected level of market return. If alpha is negative, 
the stock underperformed the market during the regression period. The large sample 
studies found that alphas did not persist. A high alpha in one period did not enable one 
to predict whether the next period’s alpha would be high or low. Thus, it was not possible 
to simply invest in stocks with high past alphas and expect them to continue to perform 
well. CAPM did not offer any free lunch to investors. Nor does it offer much predictive 
value. Rather, it is a methodology that defines what diversification can do for the investor 
and what it cannot do. 


THE EFFICIENT FRONTIER AND THE MARKET PORTFOLIO 


One important result from CAPM was that the Markowitz efficient frontier was now 
calculable and still intact. The efficient frontier is that set of portfolios that is the best 
combination of risk and return for any given level of risk or for any given level of return. 
Thus, an investor could select a point on the efficient frontier and have a realistic 
expectation that his or her portfolio was optimal, given the expectations and assumptions that 
were made. Further, the theory provided the assurance that the portfolio was optimal for the 
given risk and reward level. 

Another important conclusion was that one could pick a point on the efficient fron- 
tier and simply vary cash and leverage to adjust the risk and return along a straight line 
passing through any given point on the frontier. Thus, it was possible, by choosing the 
appropriate tangent line, to find the optimal portfolio that would allow one to perform 
better than all others (see Figure 3.1). Not too surprisingly, the CAPM proponents argued 
that this portfolio was none other than the market portfolio weighted by capitalization. 
By holding some cash, one could reduce the risk of the market portfolio to any given 
level. Alternatively, through the use of leverage, the return could be increased at a corre- 
sponding increase in risk. 
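
The straight-line trade-off between cash, leverage, and the market portfolio can be sketched as follows. The function name and all of the numbers are hypothetical, and the sketch assumes that cash earns the riskless rate with zero variability and that leverage can be borrowed at that same rate.

```python
def blend_with_cash(weight, risky_return, risky_sd, riskless_rate):
    """Expected return and standard deviation of a cash/risky-portfolio mix.

    weight is the fraction held in the risky portfolio: below 1.0 some cash
    is held, above 1.0 the position is leveraged (assumption: borrowing
    costs the riskless rate).
    """
    expected_return = weight * risky_return + (1.0 - weight) * riskless_rate
    standard_deviation = abs(weight) * risky_sd
    return expected_return, standard_deviation


# Hypothetical market portfolio: 10 percent return, 16 percent standard
# deviation, with a 4 percent riskless rate.
for weight in (0.5, 1.0, 1.5):
    ret, sd = blend_with_cash(weight, 10.0, 16.0, 4.0)
    slope = (ret - 4.0) / sd
    print(weight, ret, sd, round(slope, 4))
```

Every mix has the same excess return per unit of standard deviation, which is why the attainable combinations lie on a straight line through the chosen frontier point.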

These findings of CAPM came into wide acceptance as added numbers of MBAs were 
trained in the approach. As a result, by the 1990s many new index funds were coming into 
vogue. It was argued that investing in widely diversified portfolios was nearly impossible 
for the average investor, and thus, owning the optimal market portfolio could not be 
done. Another, perhaps more persuasive argument was that the market portfolio is the 
optimal portfolio. Thus, investing in the entire market gave one the best possible portfolio. 
With the advent of the new index funds and even newer exchange traded funds 
(ETFs), the individual investor could now own the market portfolio with relative ease. 

[Figure 3.1 Efficient Portfolio Frontier. Vertical axis: Expected Returns. Horizontal axis: 
Standard Deviation of Returns. Legend: with riskless asset; without riskless asset.] 


THE SHARPE RATIO 


Essentially, CAPM had boiled portfolio theory down to a simple ratio between the mean 
expected return and the expected variance. This ratio defined the slope of the capital 
market line. However, given that it was now possible to measure return and risk as variance 
and the nondiversifiable market correlation as beta, there soon arose ideas concerning 
how to measure the management skills of portfolio managers. 

For this purpose, Sharpe developed the Sharpe Ratio. It is simply expressed as the 
following formula: 


Sharpe ratio = (μ - r) / σ    (3.13)


where: 
μ is the mean portfolio return 
r is the riskless rate of return 
σ is the standard deviation of the portfolio 




The mean less the riskless short-term Treasury bill rate is known as the excess return. 
Short-term T-bills are considered a riskless asset. Thus, the excess return is the 
return that a portfolio manager received for bearing risk. The excess return is divided 
by the amount of risk taken, which is measured by the standard deviation. Therefore, 
we can interpret the Sharpe ratio as the amount of excess return per unit of risk 
taken. 
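
A minimal sketch of the calculation in Python follows. The function name, the return series, and the per-period riskless rate are hypothetical, and the use of the sample standard deviation is an assumption of this example.

```python
import statistics


def sharpe_ratio(portfolio_returns, riskless_rate):
    """Excess return per unit of standard deviation.

    Assumption: riskless_rate is quoted for the same period length as the
    portfolio returns, and the sample standard deviation is used.
    """
    mean_return = statistics.fmean(portfolio_returns)
    standard_deviation = statistics.stdev(portfolio_returns)
    return (mean_return - riskless_rate) / standard_deviation


# Hypothetical monthly returns in percent, with a 0.25 percent monthly T-bill rate.
monthly_returns = [2.0, -1.0, 3.0, 0.0, 1.0]
ratio = sharpe_ratio(monthly_returns, riskless_rate=0.25)
print(round(ratio, 4))
```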


LIMITATIONS OF MODERN PORTFOLIO THEORY 


Modern portfolio theory is an excellent model to enable one to find efficient portfolios. 
It can help to identify the best portfolio for a given level of risk and reward as measured 
by mean and variance. However, it has its limitations. 

In particular, it assumes a normal or lognormal distribution. Although the markets 
appear to exhibit distributions similar to those, this assumption may break down in the 
limit. To the extent that the underlying distribution is fat tailed, then the optimal portfolio 
may well depend on higher moments of the underlying distribution, as some have argued. 

The variance only incorporates the second moment. It is based on the sum of the 
squares of price changes in the underlying security. Higher moments may be necessary 
to adequately describe risk and return with respect to the preferences of real investors. 
Alternatively, better models may explicitly assume an underlying empirical distribution 
that has no closed form description. 

Some of the more recent GARCH models and their many variations have been intro- 
duced to explain the fact that the variance does not seem to be stationary. In fact, the 
variance can often double from one period to the next. The GARCH models attempt to 
deal with this phenomenon. 

In a sense, the variance of the variance may be a very interesting statistic in its own 
right. Since the variance is a squared variable, the variance of the variance would be an 
X to the fourth power variable. However, we already have the well-understood kurtosis 
statistic based on fourth-power calculations. 

Others have attempted to model the markets using the skew—a third-power statis- 
tic. These attempts are probably misguided in the sense that using the lognormal dis- 
tribution adequately accounts for the skew because the lognormal distribution appears 
skewed when viewed from arithmetic space. The symmetrical bell-shaped distribution 
only appears when the data is presented on a logarithmic scale. Nevertheless, the study 
of the relationship of the markets with respect to the skew of the underlying distribution 
continues to be an active field. 

Another possible flaw in modern portfolio theory is that it does not directly accommodate 
an investor's logarithmic utility function. Instead, it presupposes a utility 
based on mean and variance. In effect, the intersection of the capital market line and the 
investor's presumed linear utility for risk return is taken as the optimal portfolio for that 
particular investor. However, this flaw can be handled to some extent by incorporating a 
logarithmic function directly into the portfolio model. 

Although modern portfolio theory has a few flaws, it still represents the framework 
of the best attempt to deal with the challenges of building an optimal portfolio in a ratio- 
nal and intelligent fashion. 


CHAPTER 4 

Modeling Risk Management and Stop-loss Myths 


“Play the game for more than you can afford to lose... only then will you learn the game.” 
—Winston Churchill (1874-1965) 


Most authors, brokers, and market commentators espouse the use of stop-loss 
orders as a kind of free lunch. It is often billed as a technique that allows one 
to control risk without any associated cost. Some even claim that it increases 
returns. This chapter explores the myths associated with this technique and debunks 
many of the claims made by supporters. 

To the author’s knowledge, these results have never appeared in book form before. 
The author does not wish the reader to simply take his word for it that the concept of 
stop-losses has been largely oversold to investors. Because the results are quite innovative, 
this chapter offers mathematical proofs of the results. In keeping with the minimum 
math theme of this book, the proofs can be understood with only an understanding of 
the reflection principle and simple high school algebra. 

The chapter discusses and shows how stops affect the mean return, probability of a 
win or loss, and the variance of returns. The effect on the more esoteric skew and the 
kurtosis is also demonstrated for completeness of exposition. A mid-chapter summary 
of these results serves as a convenient refresher for the latter part of the chapter. 

Using the results concerning stop-losses naturally leads to a discussion of the important 
practical considerations when modeling stops, as well as knowing when to use them 
and when they should be avoided. The results for stop-losses naturally can be extended 
to the concept of stop profits and fixed-profit targets as well. Finally, the discussion 
considers how the use of stops of either variety creates a return distribution that is very 
similar to the returns from using puts and calls. 

Winston Churchill was a famous political leader but not particularly noted as an 
investor. However, there is much to be learned from the quote at the beginning of this 
chapter with respect to balancing risk and reward. For him to lay it all on the table was 
the only way to play the political game. 

In politics, in which an election is either won or lost, there is no middle ground. In 
that arena, perhaps Churchill's advice makes sense. From the perspective of the global 
power struggle of World War II, again his wisdom seems true. It seems unlikely that the Allies 
and the Axis could have negotiated a lasting peace. After all, that had already failed after 
World War I. His genius was that he had the moral fortitude to bet his nation on the 
outcome of the war. 

But in the investment arena, the investor has more choices. It is eminently possible to 
invest part of your portfolio in one stock and even to spread it out over many stocks and 
different asset classes. Part of this idea of balancing risk and reward can be described 
as using stop-loss orders to control risk. Much of this chapter deals with the question of 
whether stop-loss orders work. 


STOP-LOSS ORDERS 


A stop-loss is an order placed with a broker that is used to exit a trade if a certain adverse 
price is hit. For example, a trader may buy a stock at 50 and simultaneously place a stop 
order to sell at market if a price of 48 or lower is seen. His goal is to protect himself from 
a loss larger than 2 points. Most frequently, this type of trading tactic is used as a form 
of risk management. 

There is a considerable lore of stock market literature that advocates the use of stop-losses 
as a sort of free lunch money management system. This school of thought argues 
that stop-losses will cut your losses but allow your profits to run. On its face, it seems as 
though such tactics are a foolproof way to reduce losses and retain all of one’s profits. 

However, stop-loss orders have their drawbacks. When the price target is hit, the 
stop sell at market order becomes executable immediately. However, there is no guar- 
antee that the order will be executed at or even near the 48 price. Sometimes there are 
gaps in trading that can mean that the sell stop 48 is triggered when a price of 46 is hit, 
because no trading occurred from 49 through 46. Thus, the hapless trader may be left 
with an execution at 46 or so, instead of the putative 48 price target. 
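
The gap problem can be illustrated with a deliberately simplified model in Python. It assumes that a sell-stop market order is filled at the first traded price at or below the stop level; the function name and the price sequences are hypothetical, and real executions depend on many details that are ignored here.

```python
def stop_fill_price(traded_prices, stop_price):
    """Return the first traded price at or below stop_price, or None.

    Simplifying assumption: the stop becomes a market order and is filled
    at the very first trade that touches or passes through the stop level.
    """
    for price in traded_prices:
        if price <= stop_price:
            return price
    return None


orderly_market = [50.0, 49.5, 48.5, 48.0, 47.0]
gapping_market = [50.0, 49.0, 46.0, 45.0]   # no trades between 49 and 46

print(stop_fill_price(orderly_market, 48.0))   # 48.0
print(stop_fill_price(gapping_market, 48.0))   # 46.0
```

In the gapping case the loss is 4 points rather than the intended 2, even though the stop worked exactly as specified.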

The goal of this book is to present the ideas and concepts of portfolio modeling in an 
easily accessible intuitive manner. Wherever possible, the use of higher mathematics has 
been avoided. In particular, proofs are generally omitted in this work in the dual belief 
that they have been adequately covered elsewhere and tend to obscure the essence of 
the idea itself. However, to the best of the author’s knowledge, the following sections 
discuss ideas pertaining to stop-losses that have not been derived elsewhere. Thus, it 
was not possible to omit the proofs. Otherwise, the earnest reader would be left in the 
hapless position of taking the author’s word for the results in the face of so many market 
experts who espouse divergent opinions. Fortunately, the proofs offered are based solely 
on the very intuitive reflection principle augmented by nothing more than simple high 
school algebra. Hopefully, even nonmathematical readers will find the arguments and 
proofs understandable and intuitive. 


STOPS: EFFECT ON THE MEAN RETURN 


Once a stop-loss order is executed, one of two things can happen. The stock can go up or 
it can go down. Naturally, this raises the question of whether it was a good idea to enter 
the stop order in the first place. Suppose we make the naive but neutral assumption 
that it is equally likely for the stock to go up or down. Then half the time the investor 
would have recovered some or all of his losses. The other half of the time, the losses 
would continue to mount. Thus, there is no clear advantage to the investor. Nor is there 
any obvious augmentation to the expected average amount to be won or lost by using a 
stop-loss. 

In Figure 4.1, we see a plot of the normal distribution with the stop-loss point s 
defined. The region to the left of the stop-loss point is the region that the investor hopes 
to avoid through the use of his stop-loss order. 

The lack of increased expectation is predicted by the reflection principle and the self- 
similar properties of the normal distribution. Suppose a random walk with a symmetric 
distribution is currently at a loss point that we shall call s. Then the number of paths that 
go up from s is exactly equal to the number of paths that go down from that point. The 
downside distribution is an exact mirror image of the upside distribution from that point 
on. If we recall the picture of the normal distribution and a loss point s that is below the 
mean, then the tail to the left of point s represents the distribution that results from all 
of the paths that continue downward. 

But the reflection principle says there is another equal and opposite set of paths that 
were reflected in the upward direction. The upside distribution is an exact mirror image 
of the avoided downside distribution. 

In Figure 4.2, we see the avoided upside recovery area as well. It is shown as 
the reflected area to the right of the stop-loss point. In accordance with the reflection 
principle, we see that it represents an exact mirror image of the distribution that was 
avoided via the stop-loss order. 


[Figure 4.1 Normal Probability Distribution. Vertical axis: dnorm(x, mean = 0, sd = 1).] 


For each point in the downside distribution, there is an equal and opposite point in 
the reflected upside distribution. The probability of each point occurring is equal. Each 
point is equally distant from point s, the original stop-loss point. Suppose we were to 
pair each resulting point in the downside distribution that went down, x, with its mirror 
image in the upside distribution that went up by the same amount. The avoided loss on 
the downside would be 

s - x 

The avoided more favorable result in the upside portion would be 

s + x 

Thus, the combined total for each matched pair is given by 

(s - x) + (s + x) = 2s    (4.1)


The result is always 2s because the x terms always cancel each other out. But we 
are dealing with point pairs composed of two points. So we must divide by 2 to find the 
result per trade. After dividing by two, the average result is simply a loss of s per path. 
This is the same result that would have been achieved without the use of a stop-loss. 

[Figure 4.2 Normal Probability Distribution. Horizontal axis from -3 to 3; vertical axis 
from 0.0 to 0.4.] 

Therefore, we can say that there is no net contribution to expected or average return 
by adding a stop-loss. This probably comes as a surprise, if not an outright shock, to most 
traders. It is certainly at odds with the conventional wisdom of Wall Street. 
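
The argument can be checked on a toy model. The sketch below enumerates, with exact fractions, every path of a fair random walk that moves up or down one point per step, and compares the expected result with and without a stop. The walk, the number of steps, the stop level, and the idealized fill exactly at the stop are all assumptions of this example; it ignores drift, gaps, and commissions.

```python
from fractions import Fraction


def final_distribution(steps, stop=None):
    """Exact distribution of the final profit of a fair +1/-1 random walk.

    If stop is given (a negative integer), the position is closed the first
    time the walk touches that level and stays there for the remaining steps.
    """
    half = Fraction(1, 2)
    dist = {0: Fraction(1)}
    for _ in range(steps):
        nxt = {}
        for level, prob in dist.items():
            if stop is not None and level <= stop:
                nxt[level] = nxt.get(level, 0) + prob   # already stopped out
                continue
            for move in (-1, 1):
                nxt[level + move] = nxt.get(level + move, 0) + prob * half
        dist = nxt
    return dist


def expected_value(dist):
    return sum(level * prob for level, prob in dist.items())


without_stop = final_distribution(10)
with_stop = final_distribution(10, stop=-3)
print(expected_value(without_stop), expected_value(with_stop))
```

In this symmetric toy model both expectations are exactly zero: the losses avoided below the stop are offset by the recoveries that the stop also prevents.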

However, it is good to remember that much of the lore of Wall Street has been pop- 
ularized by sell-side brokers. Such so-called wisdom has often served to generate the 
commissions and trading volume that the Street needs for its very life blood. After all, 
buy and hold is bad for a commission-based business. The only strategies that the aver- 
age broker is likely to espouse without cutting his own throat are those that will increase 
the level of active trading by his customers. 

For this purpose, a stop-loss is perfect. Not only does it appear to make sense, but it 
actually generates two commissions. The first commission is the obvious one, when the 
position is sold through the stop-loss. The second is the commission that is generated 
when the money is reinvested in a new position. 




There is another hidden cost, as well. Over time, the equity market tends to rise. 
The phenomenon is often termed the long-term upward drift of the market. Thus, in the 
absence of any superior timing ability, the time out of the market represents lost return. 
As an unintended consequence of using stop-loss orders, the investor typically is also out 
of the market and thus loses out to some extent on the long-term upward drift, while a 
new investment is being evaluated. 


STOPS: EFFECT ON THE PROBABILITY OF GAIN 


Another consideration in the use of stop-loss orders is how stops mutate the probability 
of a gain or a loss. We shall see that adding a stop-loss always increases the probability 
of a loss and reduces the probability of a gain. 

At its simplest, we can see that when a stop-loss is executed, it is a loss. Thus, no 
loss events are avoided by a stop-loss strategy. Only the size of the largest losses can 
be reasonably argued to be reduced. Conversely, a stop-loss strategy causes possible 
profitable recovery events to be avoided. Thus, qualitatively, the probability of a loss can 
only increase through the use of stop-loss tactics. In the following discussion, we will 
attempt to address this idea in a quantitative manner as well. 

Again, we shall invoke the reflection principle. The reader is referred to Figure 4.2. 
For a given stop-loss s it is easy to see that the tail to the left of s represents the avoided 
downside outcomes. The area under the curve represents the probability of being below 
s if the stop-loss had not been placed. However, again the reflection principle assures us 
that there is an equal mirror image distribution to the right, as well. Theoretically, the 
tails of the normal distribution go off to infinity. However, the tails become vanishingly 
small as they do so. But the point to understand is that some part of the right tail that 
is avoided by a stop-loss would have allowed the trader to recover back into profitable 
territory. Thus, the amount of the tail area under the curve that stretches back into profit 
territory is the amount by which the probability of a gain will be reduced, and the chance 
of loss increased. If the trader uses stops that are close to the current market price, this 
adverse effect on probability is considerable. If the stop-loss point is very distant, the 
effect on probability may be quite negligible. However, there is always an effect, and it 
always reduces the probability of a gain. 


STOPS: PROBABILITY OF BEING STOPPED OUT 


Suppose we have a stop-loss at price s. From our handy tables of the normal distribution
contained in Appendix 1, we can tell that the probability of the price being in the tail
anywhere below point s is given by some number p if there is no stop-loss order in place. It
is simply given by the cumulative distribution function of the normal distribution
up to price s. In this case, we would treat s as a normalized value or Z score. For example,
if s is placed at one standard deviation below the starting level, the normal tables
would tell us that about 16 percent of the distribution lies below the 1 standard deviation
level.


Figure 4.3 Normal Distribution Stop = 1.5 sd

At first blush, it would seem that this is the probability of being stopped out. But that 
reasoning is not correct. Once again, we recall the reflection principle. For each of the 
paths that wound up in the leftmost tail, an equal and opposite path came back above the 
stop-loss point s. According to the reflection principle, the area under that curve must be 
equal. The reader is referred to Figure 4.3 for a visual illustration. In our example, this 
would represent another 16 percent of the total probability distribution. Thus, we have 
16 percent in the leftmost tail lower than point s and we have another 16 percent in the 
reflected mirror image of that tail. This gives us a total of 32 percent of the time that the 
trade will be stopped out with a loss of s. 




Again, the reflection principle assures us that if the probability of being at or be- 
low s without a stop is given by p, then the probability of being stopped out is exactly 
double that, or 2p. Placing a stop-loss order exactly doubles a trader's odds of being 
at or below the stop-loss point. This is a profound result that traders should consider 
carefully. 
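
The doubling result can be checked numerically. The following is a minimal sketch, assuming a driftless random walk with normally distributed outcomes and a stop expressed as a Z score at or below the starting level. The function names are illustrative and are not taken from the book's sample programs.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_below_without_stop(z_stop: float) -> float:
    """Probability p of finishing at or below the stop level when no stop is used."""
    return normal_cdf(z_stop)


def prob_stopped_out(z_stop: float) -> float:
    """Reflection-principle probability 2p of touching the stop level.

    Assumes the stop sits at or below the starting level (z_stop <= 0).
    """
    if z_stop > 0:
        raise ValueError("the stop must be at or below the starting level")
    return 2.0 * normal_cdf(z_stop)


for z in (-0.5, -1.0, -1.5, -2.0):
    p = prob_below_without_stop(z)
    print(f"stop at {z:+.1f} sd: p = {p:.4f}, stopped out = {prob_stopped_out(z):.4f}")
```

For a stop one standard deviation below the start, this gives p of about 0.159 and a stop-out probability of about 0.317, in line with the 16 and 32 percent figures quoted above.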


STOPS: EFFECT ON VARIANCE AND STANDARD 
DEVIATION 


To examine the effect of stop-loss orders upon the variance and therefore, also its close 
cousin, the standard deviation, we will once again invoke an analysis based on the re- 
flection principle. For a given stop-loss level s we again analyze the distribution based on 
the matched pairs of reflected points. Again, the reflection principle assures us that there 
is an equal number of matched points above and below the stop-loss point. Thus, for a 
stop-loss at s and a given displacement above and below that point we have: 


Downside avoided loss: s − x
Upside avoided gain: s + x

The variance and standard deviation formulas are based on the sum of the squares of
the return values. So the only thing we need to consider is the contribution to the sum
of the squares.

Assuming no stop: We set s to a minus value, reflecting that it is a loss. Then taking
squares, we have

\[ (-s - x)^2 + (x - s)^2 \]
\[ (s^2 + 2sx + x^2) + (s^2 - 2sx + x^2) \]

Combining we have

\[ 2(s^2 + x^2) \tag{4.2} \]

Assuming a stop is in place, the sum of the squares for the two paths is simply given by

\[ 2s^2 \tag{4.3} \]

Clearly, the difference between equation 4.2 with no stop and equation 4.3 with a stop
will be

\[ 2x^2 \tag{4.4} \]

Because $x^2$ is always a positive number, we have the important result that:


The variance and standard deviation will always be reduced as a function of that 
amount for each set of matched pair points. Trivially, the sum of all the matched pairs 
will always show a positive reduction. 


EFFECT ON SKEW


To evaluate the impact of a stop-loss order on the skew of the return distribution, we
shall use a similar technique. We first decompose the problem into the two cases. They
are the case in which no stop is used and the case in which a stop-loss is employed. We
compare the difference in the skew between these two cases.


For the Case of No Stop-loss

We can derive the effect on skew with a similar matched pair calculation based on the
cubes of the matched points. Here we have the cubes of the avoided loss and avoided
gain:

Part 1 + Part 2

\[ (-s - x)^3 + (x - s)^3 \]

Expanding, we get

Part 1:

\[ -s^3 - 3s^2x - 3sx^2 - x^3 \tag{4.5} \]

Part 2:

\[ -s^3 + 3s^2x - 3sx^2 + x^3 \tag{4.6} \]

Combining the two parts yields

\[ -2s^3 - 6sx^2 \tag{4.7} \]




For the Case of Stop-loss

We have a loss of −s and the skew is based on the contribution of the cubes of the
deviations from the mean. The resulting skew for both sides is

\[ -2s^3 \tag{4.8} \]

Clearly the difference is $-6sx^2$. Given that $x^2$ is always a positive number, we see
that $-6sx^2$ is always negative. Thus, the skew is always reduced by the addition of a
stop-loss order to the strategy. Many traders consider a positive skew to be a desirable 
attribute of a distribution of returns. 


EFFECT ON THE KURTOSIS 


Using our usual definitions for avoided loss points and avoided gain, we can calculate the
kurtosis or fourth moment of the normal distribution. Starting with the expanded form
of equation 4.2 from the variance proof, we proceed as follows:

\[ (s^2 + 2sx + x^2)^2 = s^4 + 4s^3x + 6s^2x^2 + 4sx^3 + x^4 \tag{4.9} \]

and

\[ (s^2 - 2sx + x^2)^2 = s^4 - 4s^3x + 6s^2x^2 - 4sx^3 + x^4 \tag{4.10} \]

Combining equations 4.9 and 4.10 we get

\[ 2s^4 + 12s^2x^2 + 2x^4 \tag{4.11} \]

For the trade with a stop-loss, the fourth moment contribution for both sides is:

\[ 2s^4 \tag{4.12} \]

Clearly, the difference between equations (4.11) and (4.12) is

\[ 12s^2x^2 + 2x^4 \]


Because all of the signs and terms of the difference are positive, it is clear that the 
effect of adding a stop-loss is to reduce the kurtosis. 
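
The matched-pair algebra used for the variance, skew, and kurtosis arguments is easy to verify with exact integer arithmetic. The sketch below is only a check of the expansions, assuming s is the size of the stop (so the stopped outcome is −s) and x is the displacement of the two reflected outcomes; the function name and the sample values are illustrative. The differences between the no-stop and stop sums should come out as 2x^2 for squares, −6sx^2 for cubes, and 12s^2x^2 + 2x^4 for fourth powers.

```python
def matched_pair_power_sums(s: int, x: int, power: int) -> tuple:
    """Return (no_stop, with_stop) sums of the given power for one matched pair.

    s is the size of the stop-loss (the stopped outcome is -s) and x is the
    displacement of the two reflected outcomes around the stop.
    """
    no_stop = (-s - x) ** power + (x - s) ** power
    with_stop = 2 * (-s) ** power
    return no_stop, with_stop


s, x = 3, 2  # arbitrary illustrative values
for power, label in ((2, "squares"), (3, "cubes"), (4, "fourth powers")):
    no_stop, with_stop = matched_pair_power_sums(s, x, power)
    print(f"{label}: no stop = {no_stop}, with stop = {with_stop}, "
          f"difference = {no_stop - with_stop}")
```

With s = 3 and x = 2 the three differences are 8, −72, and 464, which agree with the closed-form expressions.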




STOP-LOSS: SUMMARY


In the foregoing discussion, we see that stops are not all they are supposed to be. They
have some benefits and some drawbacks. There are still many who prefer stops. They
should remember that for a random walk, based on a normal distribution, five points are
true on a per-trade basis:


1. Stops will neither help nor hurt your expected (average) return. 

2. Stop-losses will double the probability of being at or below the stop value. Your 
probability of loss will invariably increase with stop-losses. 

3. The reduction in probability of wins will result in more runs of losses, and thus, 
overall, drawdowns over a succession of trades will tend toward the original value. 


4. The variance and standard deviation will decrease. This represents a real reduction 
in risk. 

5. The skew of the returns will become more negative resulting in approximately the 
same number of large gains with more numerous but limited losses. 


Using a stop will mutate the distribution of returns. The resulting distribution will
tend to look somewhat like a normal distribution, but with many losses piled up at the
stop-loss point.

There is a tendency to return to the original normal distribution when the trades are 
considered collectively over time. Under reasonable assumptions, all distributions tend 
to return to a normal distribution in the limit. If the underlying process is additive, then it 
will tend to return to the normal distribution. If the native process is multiplicative, it will 
tend to return to the lognormal distribution in the limit. In other words, the stops make 
little difference in the long run, but do alter the shape of the distribution in the short run. 
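
These per-trade effects can be illustrated with a small simulation. The sketch below is not the book's own program. It assumes a driftless coin-flip walk (steps of +1 or −1) as a stand-in for the normal random walk, a stop placed one standard deviation below the entry, and fills exactly at the stop level; the function names, trade counts, and seed are arbitrary illustrative choices.

```python
import random
import statistics


def simulate_trades(n_trades=10_000, n_steps=100, stop=-10, seed=42):
    """Simulate driftless +/-1 random-walk trades with and without a stop.

    Returns the final profit/loss of each trade without a stop, the final
    profit/loss with the stop applied, and the number of trades stopped out.
    """
    rng = random.Random(seed)
    no_stop, with_stop = [], []
    n_stopped = 0
    for _ in range(n_trades):
        position = 0
        stopped = False
        for _step in range(n_steps):
            position += 1 if rng.random() < 0.5 else -1
            if position <= stop:
                stopped = True  # the stop level was touched on this path
        no_stop.append(position)
        with_stop.append(stop if stopped else position)
        n_stopped += stopped
    return no_stop, with_stop, n_stopped


def summarize(outcomes):
    """Mean, standard deviation, and win/loss frequencies of trade outcomes."""
    n = len(outcomes)
    return {
        "mean": statistics.fmean(outcomes),
        "stdev": statistics.pstdev(outcomes),
        "p_gain": sum(o > 0 for o in outcomes) / n,
        "p_loss": sum(o < 0 for o in outcomes) / n,
    }


no_stop, with_stop, n_stopped = simulate_trades()
stats_no_stop = summarize(no_stop)
stats_with_stop = summarize(with_stop)
p_below = sum(o <= -10 for o in no_stop) / len(no_stop)
p_stopped = n_stopped / len(no_stop)

print("no stop:  ", stats_no_stop)
print("with stop:", stats_with_stop)
print(f"P(at or below stop, no stop) = {p_below:.3f}, P(stopped out) = {p_stopped:.3f}")
```

In a run of this kind the two means should be close to each other, the standard deviation should fall, and the probability of a loss should rise once the stop is added. Because the walk moves in whole steps, the stop-out frequency comes out somewhat below the exact doubling that holds for the continuous normal model.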


MODELING STOPS


When modeling stops, it is important to remember that the process is very path dependent.
Thus, the model must accurately duplicate and include the entire path. In the foregoing,
we have assumed the normal distribution as our theoretical model. As we know,
the normal distribution conforms to an additive model of the price formation process. 
However, the results also apply to a lognormal distribution if viewed from logarithmic 
space. 

When modeling stops, one must be careful to include highs and lows for the day in 
the model. It is never sufficient to simply use closing prices and assume the stops were 
or were not executed. In addition, some estimate must be made to account for slippage. 


Slippage is always caused by the fact that there is a gap between the bid and the ask on
the exchange at any given time. Thus, it is a rarity that, after a price of 48 is hit, one could
actually get out at 48. It will usually be somewhat lower. The analyst must also take into
account gap openings.

All things considered, stops are hardly the foolproof free lunch that some have made 
them out to be. It is quite clear that stops involve trade-offs in the short term and have 
only a modest impact on the distribution of returns in the long-term scheme of things. 

In this chapter, we have focused on the theoretical distributions in order to derive 
some properties concerning how stops will change the distribution. Naturally, the focus 
is on intuition. Readers who wish to use the empirical distribution will probably prefer 
to model the use of stops using actual market data. 

Any such models must religiously include reasonable estimates for slippage as a min- 
imum. The entire path must be considered as well, including highs and lows reached 
during the period. Additionally, in the event of a gap opening, the actual open price must
override the presumed stop price. Failure to take into account all of these details has led
many an analyst to overestimate the benefits of stop-loss orders.

It is only after completing such an exercise that one can realistically evaluate the
impact of stop-loss orders on the trading results. However, it is quite likely that empirical
studies will follow the results of this chapter, on a qualitative basis at least.


IDENTIFYING WHEN TO USE STOPS AND WHEN NOT TO


Clearly, we have shown that the use of stop-loss techniques is far from the panacea many 
on Wall Street make it out to be. However, it does have its place. 

Clearly, we have seen that using the standard normal distribution model, the ben- 
efits of stops are lukewarm at best. In order to identify the situations when one might 
profitably employ stops, it is necessary to explicitly test that question using the empirical 
distribution. Any such test must include explicit recognition and testing of intraday (or 
intra period) highs and lows. If one fails to include these extreme values in price, any 
study is fatally flawed. 

One very useful concept in studying stop-losses is the idea of the maximum adverse 
excursion. Essentially, the idea is that, when we evaluate a historical data set of individual
trades, we not only look at the outcome of each to its conclusion, but we look at how
each trade did at all intermediate time frames as well. 

Suppose we bought a stock at 50 and eventually sold it at 45 for a 10 percent loss. 
However, if at one point during the time we owned the stock, it went down to 40 at its 
nadir, the maximum adverse excursion was −10 points, or −20 percent. This concept
can be useful when studying the effect of stops on a trading strategy. 
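
The maximum adverse excursion is straightforward to compute once the low of each period is recorded. The following is a minimal sketch for a long trade; the function name and the intermediate lows are hypothetical, chosen only so that the path matches the example above (bought at 50, low of 40, sold at 45).

```python
def max_adverse_excursion(entry_price: float, lows: list) -> tuple:
    """Return the worst excursion below entry as (points, fraction of entry).

    `lows` holds the lowest price of each period while the long trade was open.
    The result is zero if the price never traded below the entry price.
    """
    worst_low = min(lows)
    points = min(0.0, worst_low - entry_price)
    return points, points / entry_price


# The example from the text: bought at 50, traded down to 40, sold at 45.
period_lows = [49.0, 46.5, 40.0, 43.0, 44.5]  # hypothetical path of lows
points, fraction = max_adverse_excursion(50.0, period_lows)
print(f"MAE = {points:.1f} points ({fraction:.0%})")
```

For this path the function reports −10 points, or −20 percent of the entry price, even though the trade itself closed with only a 10 percent loss.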




In order to model stop-losses empirically, we must first set the stop rule. Usually this 
is a fixed percent loss. Some technical analysts like to use a fixed multiplier times the 
Average True Range (ATR). Others use trailing stops that move up if the price increases. 
Whatever method is chosen, it must be rigorously defined and tested. 

After the method is defined, we (or our computer) examine each individual trade 
following its path during each time period to see if the stop would have been executed 
because the lowest price for the period slipped below the stop-loss point. 

At this point, we must calculate a price at which the stop would have been executed. 
This is trickier than it might seem at first glance. As a minimum, the price should include 
something called slippage. This is the amount by which the actual execution price will 
slip below the stop price. As a minimum, the slippage should include the typical bid-ask 
spread. 

However, often this is not enough. Many times, a stock will open for trading at a 
price much lower than the previous day. This is known as a gap opening because of the 
telltale gap that shows up on a chart of the stock. Often, such a discontinuous price jump
will occur as the result of news. At other times, it is simply because a large seller wants 
to unload his stock on no obvious news. In any event, in order to model gap openings, 
the trader must recognize that the gap occurred and that the open price is the price to 
use, not the stop-loss price less slippage. 
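
The execution rules described so far can be sketched in a few lines. This is a simplified illustration for a long position on daily bars, not a complete backtester: the `Bar` class, the `stop_fill` function, and the sample prices are all hypothetical, slippage is assumed to be a fixed amount, and intraday gaps are ignored because bar data cannot show them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Bar:
    """One period of price history (hypothetical container)."""
    open: float
    high: float
    low: float
    close: float


def stop_fill(bars: List[Bar], stop_price: float,
              slippage: float) -> Optional[Tuple[int, float]]:
    """Return (bar index, fill price) for a sell stop on a long position.

    If a bar opens at or below the stop (a gap opening), the open is used as
    the fill price. Otherwise, if the low trades through the stop, the fill is
    the stop price less slippage. Returns None if the stop is never hit.
    """
    for index, bar in enumerate(bars):
        if bar.open <= stop_price:
            return index, bar.open
        if bar.low <= stop_price:
            return index, stop_price - slippage
    return None


history = [Bar(50.0, 51.0, 49.0, 50.5), Bar(50.5, 51.0, 47.5, 48.5)]
print(stop_fill(history, stop_price=48.0, slippage=0.10))
```

Note that the test uses each bar's low rather than its close, so a stop that was touched intraday is counted even when the close recovers above it.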

The problem of gaps is even more insidious when we consider that trading gaps can 
occur during the trading day. Often, they are the result of fast markets. Sometimes they 
result from news. Other times, the cause is the penetration of some important technical 
level. Traders who place their stops based on obvious technical levels such as previous 
highs or lows may be more susceptible to having their stops run by floor traders, much 
to their chagrin. 

In any event, modeling stops should include the consideration of intraday gaps, as 
well as opening gaps. The only effective way to do this is to use tick data that considers 
every trade as it occurs. Such data is more expensive and more voluminous. 

Although there are numerous problems and pitfalls associated with modeling stops, 
it can be done. However, we need to consider the fact that the model essentially helps us 
to analyze only one value for the stop-loss. The obvious question is whether the value we 
chose is the best. Naturally, to do this, we must define what is meant by best. Is it
maximizing the mean return, minimizing the standard deviation, or improving the probability
of success?

Another issue is that when we optimize the stop-loss value, we are creating sev- 
eral other problems for ourselves that are undesirable. First, we are trying many differ- 
ent models with the stop-loss value s allowed to vary. Statistically speaking, this can be 
viewed as either reducing the degrees of freedom or adding additional hypotheses. The 
point is that when we do this, we are increasing the chances that the result we ultimately
obtain will be spurious. We make it more likely that the result is due to chance because
we tried more ways until one finally worked. 

Any statistical tests that we perform on the data will be reduced in power because of 
the need to adjust for the added number of tests tried. Potentially, even more dangerous 
is the possibility that the needed adjustments will not be made by the novice trader and 
that a result will be accepted for live trading that was wholly the result of chance. 

As an example, suppose we had a data set and we found that the optimal stop value 
was −17.3 percent. The tests showed a profit at that level. But when we look at the other
values tried, the trading method showed a loss at the −17 percent and at the −18 percent
levels for the stop-loss. It simply is not credible that the testing actually found a sweet
spot at −17.3 percent. Rather, it is more likely that the so-called sweet spot is due to inclusion
or exclusion of one or a few observations that skewed the result. It is roughly akin
to a trading system that includes the rule “Buy Google at 100.” Well, the rule looks good 
on paper when we backtest it, but going forward it is likely to offer us no useful trades. 
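
One simple guard against this kind of spurious optimum is to check whether the neighboring stop levels tell the same story. The sketch below is one possible heuristic rather than a formal statistical test. It assumes the backtest results are available as (stop level, net profit) pairs; the function name and the profit figures are hypothetical and serve only to mirror the −17.3 percent example.

```python
def is_isolated_sweet_spot(results):
    """Flag an optimum whose immediate neighbors both fail to make money.

    `results` is a list of (stop_level_percent, net_profit) pairs from
    backtests that differ only in the stop level.
    """
    ordered = sorted(results)  # sort by stop level
    best = max(range(len(ordered)), key=lambda i: ordered[i][1])
    neighbors = ordered[max(0, best - 1):best] + ordered[best + 1:best + 2]
    best_is_profitable = ordered[best][1] > 0
    neighbors_all_lose = bool(neighbors) and all(p <= 0 for _, p in neighbors)
    return best_is_profitable and neighbors_all_lose


# Hypothetical backtest results: profitable only at -17.3 percent.
suspicious = [(-18.0, -1.2), (-17.3, 0.8), (-17.0, -0.5)]
# Hypothetical results where profit changes smoothly with the stop level.
smooth = [(-20.0, 0.5), (-15.0, 1.0), (-10.0, 0.7)]

print(is_isolated_sweet_spot(suspicious))
print(is_isolated_sweet_spot(smooth))
```

A result flagged by such a check deserves suspicion before it is accepted for live trading.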

This issue is an example of a broad class of issues that some have dubbed overfitting,
and others call data mining. Whenever we add more rules but each added rule only serves 
to eliminate a few observations, then it is time for the red flag to go up. The best policy is 
to keep the number of rules and fitted parameters to as few as possible, and the number 
of observations in our data as large as possible. 


STOP-PROFITS 


There is another order strategy that is quite different from the stop-loss order. That is the 
stop profit or profit target. It can be implemented as a stop order to sell at market or as a 
limit order at a given price target. 

Either way, we can analyze the resulting distribution in just the same way that we 
did with the previous study of stop-loss orders. Each of the previous proofs also has an 
analogous proof based on the reflection principle applied to the profit target scenario. 
Because the proofs are essentially mirror images of those presented previously, we shall 
omit them here and leave them as an exercise for the reader. 

However, the implications of the profit target are worth discussing and are presented 
as follows: 


• Profit targets will not alter the expected return from a trade. They will neither add to
nor subtract from the average return.

• Profit targets will reduce the variance and hence the standard deviation per trade.
This can be a real risk reduction.

• The skew of the distribution will be altered in a similar, but inverse, way to that for
stop-losses.

• The kurtosis will also be changed in an analogous way.

• The probability of a profitable trade will increase because the downside reflected
paths from stop point s will have been eliminated. Some of those may have resulted
in losses.


It is worth noting that investors who are so inclined may wish to utilize both stop- 
losses and profit target strategies in their trading. The combination of the two can be 
expected to have no impact on the mean return, but will result in a greater reduction in 
standard deviation than either alone. 


PUTS AND CALLS


There are many similarities between put and call options and stop-loss or profit target
strategies. In this section, we shall discuss the similarities, and some differences as well. 

A call option gives the owner the right to buy 100 shares of stock at a fixed price 
for a fixed period of time. The fixed price is known as the strike price and the time is 
standardized by expiration months. Generally speaking, most stock options expire on 
the third Friday of the expiration month. A call buyer pays a price known as a premium 
to purchase the call option. 

A put option is the right to sell 100 shares of stock at the strike price at any time up
until the expiration period. The holder of an option is not required to exercise it.
He or she will only do so if it is favorable.

If we compare a call option to a stop-loss strategy, we see that both offer a fixed 
downside loss limit. For this to be precisely true, we temporarily assume that the stop- 
loss is actually executed at the stop-loss price. In effect, the premium paid for the option 
acts as a limit on the loss. The buyer of a call can lose no more than his premium invest- 
ment if held to maturity. Unlike the case for the stop-loss order, the loss to a call buyer 
is precisely and strictly limited. As discussed earlier, a stop-loss is not guaranteed to be 
executed at the stop price. 

In a similar way, the probability distribution for the returns from a call option has a
similar shape to the returns for a stop-loss analysis with a similar loss characteristic. Re- 
alization of these facts prompts one to wonder why more traders do not use call options 
instead of stop strategies. 

For traders who employ profit target strategies, we find that the distribution of returns
is quite similar to selling a put option. For example, suppose we sell a three-month
put with a strike price of 50 for a premium of 5 when the current market price is 50. Our
maximum profit is 5 if the stock goes to 55 because we get to keep the premium received
for the put. The profit potential is strictly limited to 5. It can never be any
more.




However, this is the same profit profile that a trader who employs a profit target
strategy would have. This trader has a profit outlook of a maximum of 5 if his 55 target 
is reached. In addition, he has a risk profile all the way down to a loss of 50 if the stock 
improbably drops to zero. This is the same risk that a put seller has. Hopefully, the reader 
can see that both stop-loss and stop-profit or profit target strategies are analogous to 
option strategies and can be modeled in the same way. 
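
The two profit profiles can be tabulated side by side. The sketch below is a simplified comparison at the end of the holding period, using the numbers from the example (strike 50, premium 5, entry 50, target 55). The function names are illustrative, commissions are ignored, and the profit target is assumed to fill exactly at 55 whenever the final price is at or above it, which sets aside the path dependence discussed earlier.

```python
def short_put_pnl(price_at_expiry: float, strike: float = 50.0,
                  premium: float = 5.0) -> float:
    """Profit or loss per share for the seller of a put held to expiration."""
    return premium - max(0.0, strike - price_at_expiry)


def profit_target_pnl(final_price: float, entry: float = 50.0,
                      target: float = 55.0) -> float:
    """Profit or loss per share for stock bought at `entry` with a profit target.

    Simplifying assumption: the position is sold at `target` if the final
    price is at or above it, and at the final price otherwise.
    """
    return min(final_price, target) - entry


for price in (0.0, 40.0, 50.0, 55.0, 70.0):
    print(f"price {price:5.1f}: short put {short_put_pnl(price):6.1f}, "
          f"profit target {profit_target_pnl(price):6.1f}")
```

Both positions are capped at a gain of 5 and both lose heavily if the stock collapses. With these particular numbers the profiles are similar in shape rather than identical: the premium received cushions the put seller's worst case to a loss of 45 rather than 50, and the put seller keeps the full premium at any price of 50 or higher.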

Conversely, we note that one can model puts and calls in much the same fashion if 
the positions are held to maturity. Thus, the probability models and arguments discussed 
in this section extend to options in a natural way. 

One should hasten to add, however, that the resulting distribution for the returns on 
puts and calls themselves is fundamentally nonnormal. Again this is similar to the fact 
that using a stop-loss or profit target strategy results in a decidedly non-normal distribu- 
tion as well. Thus, anyone developing such a model should be aware of that fact. 

In the same way, the distribution for the stop-loss strategies and for the profit tar- 
get strategies also results in a nonnormal probability distribution. Again, the trader is 
cautioned to avoid methods that assume the normal distribution for analyzing these 
situations. 

Readers are urged to review the example program CD-ROM and the sample programs 
used to generate some of the charts in this book. In particular, the graphs relating to the
normal distribution and the reflection principle can be very helpful in understanding this 
subject. 


CHAPTER 5
Maximal Compounded Return Model


“Money begets money.” 


—Giovanni Torriano 
Foul cankering rust the hidden treasure frets,
But gold that’s put to use, more gold begets.
—William Shakespeare, 
Venus and Adonis, 1593 


Everyone wishes to make money on his or her investments. So maximizing investment
return is an important subject in its own right. Some would argue that maximizing
return is the only proper investment goal. This chapter should be very
interesting for those folks.

For the rest of us who prefer to consider both return and risk in our investment
decision making, this chapter is a start. It deals with how to maximize the compounded
return on your portfolio. In a sense, this enables us to identify an upper bound on our
risk taking. We shall see that there is a limit to risk taking. If we go beyond that limit, we
will reduce return and increase risk. It is the worst of both worlds.

The basic concept of relative return is discussed as a foundation for this chapter.
The essential difference between a simple average return for the stocks in a portfolio
and the compounded returns on your portfolio is considered. Both types of averaging
are needed, and the reader will see where each is appropriate.

Previously, we debunked some of the myths surrounding stop-loss orders. But that
leaves us with a vacuum as far as good money management practices are concerned. One
emphasis in this chapter is on the concept of proper position sizing as the best form of
risk control. In addition, we shall see how and why position sizing is critical in order to
achieve optimal compounded returns.




The chapter features a model that will maximize the long-term compounded return 
on a portfolio. The models have limitations that are thoroughly explained in the text.

From the idea of a model based on the theoretical distributions, the presentation 
moves on to consider the real-world model of the empirical distributions with its atten- 
dant warts and flaws. Modeling the empirical distribution is extremely important, in order 
to be able to capture the real-world fat tails phenomenon. The techniques discussed also 
consider the all important issue of correlations between the variables. 

Out of this framework, the enhanced maximal investment formula is derived. The
finale of the chapter discusses the reality that the maximal investment formula idea is 
cursed with large drawdowns and swings of capital that many investors will find unac- 
ceptable. Some ad hoc techniques to control the drawdowns are discussed. 

Readers are encouraged to play along at home with the companion CD-ROM included
with this book.


OPTIMAL COMPOUND RETURN MODELS 


This chapter discusses the steps required to build and optimize a portfolio model that 
achieves the maximum possible return. In general, it is possible to optimize only one 
variable at a time. Thus, any model that seeks to optimize must somehow combine all 
of its objectives into a single function or formula. This function is called the objective 
function. It is essential in all optimization problems. 

For the purpose of maximizing compounded returns, our objective function will 
most easily be expressed as a natural log function. The inputs to the log function will 
be in the form of a relative portfolio return for a discrete set of returns from backtesting. 
Each return will be weighted by its fraction of the entire portfolio. For an all-cash port- 
folio with no leverage, all of the weights should sum to 1, if the cash position is taken as 
one of the weights. 

For portfolios that employ leverage, the sum of the weights may be greater than 1. 
For example, a portfolio that is leveraged 2:1 would normally have weights that sum to 
2. Some hedge funds use short-selling techniques to reduce risk. In this case, the weight 
for the short positions will be negative. The sum of the weights for the entire portfolio 
could be anything. In particular, it could be negative. 


RELATIVE RETURNS


We shall define a raw relative return as follows:

\[ P_t / P_{t-1} \]

where

$P_t$ = Price at time $t$


Thus, the raw relative return is simply the ratio of today’s price to the price one time 
period before. Note that this definition is of a raw relative return. There is no consider- 
ation for dividends, stock splits, rights, or any of the other myriad distributions to share 
holders. 

Accordingly, we must modify the previous definition to take into account such real- 
world oddities as dividends, stock splits, and many other types of corporate distributions. 
According to one study, there have been over 300 different types of corporate distribu- 
tions throughout history. This number will only continue to grow as corporations invent 
new and exotic ways to distribute wealth to their shareholders. We shall combine all 
such distributions into an adjustment factor called D. This represents the value of the 
dividend or distribution during period t. The formula becomes:

\[ \text{Relative return} = (P_t + D)/P_{t-1} \tag{5.1} \]


All of the inputs to the natural log function shall be in the form of relative returns. 
Relative returns give us numeric values that are centered on 1. When the price is un- 
changed, the relative return will be 1 and In(1) is zero. Relative returns below 1 corre- 
spond to when the price declines, and relative returns above 1 will occur when the price 
rises. We also note that the relative return concept generalizes quite naturally to different
time scales than 1 day, simply by changing the time scale. So time frames of a week,
month, quarter, or even a year are all subsumed under this framework.
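
As a small illustration of the adjusted relative return and its logarithm, consider the sketch below. The function name and the price and dividend figures are hypothetical; a real implementation would also need to handle stock splits and the other distributions mentioned above.

```python
import math


def relative_return(price_now: float, price_prev: float,
                    distribution: float = 0.0) -> float:
    """Relative return (P_t + D) / P_{t-1} for one period."""
    if price_prev <= 0:
        raise ValueError("the previous price must be positive")
    return (price_now + distribution) / price_prev


prices = [50.0, 51.0, 51.0, 49.5]        # hypothetical closing prices
distributions = [0.0, 0.0, 0.5, 0.0]     # hypothetical dividend paid in period 2

relatives = [
    relative_return(prices[t], prices[t - 1], distributions[t])
    for t in range(1, len(prices))
]
log_relatives = [math.log(r) for r in relatives]

for t, (r, lr) in enumerate(zip(relatives, log_relatives), start=1):
    print(f"period {t}: relative return = {r:.4f}, ln = {lr:+.4f}")
```

In the second period the price is unchanged, but the dividend lifts the relative return above 1, so its logarithm is positive rather than zero.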

Normally, the price data would come from actual empirical price history. It could 
be all such prices for general portfolio work, or it could be selected trades from a me- 
chanical system of some kind. It is theoretically possible to assume a random known 
distribution of some kind as well. In that case the data would be random realizations 
from the putative distribution. However the data points are obtained, the formulas
in this chapter are designed for discrete data, but that should not limit their general
application.

Traditional portfolio modeling has assumed a theoretical distribution, but the orien- 
tation of this book is more toward an empirical approach. There are several reasons for 
this approach: 


1. The empirical distribution is a more general, assumption-free framework. It is 
less prone to be dramatically incorrect. 

2. Considerable evidence has accumulated to the effect that the variance of the market
exhibits bursts of serially correlated volatility. The assumption of a stationary
variance seems to be incorrect. Thus, resorting to the empirical distribution reduces the
dependence of the model on the assumption of a stationary variance.

3. Formerly, the problem of running large simulations on empirical data was a 
nearly intractable problem for all but the most powerful computers. In the present 
era of fast PCs with built-in hardware floating-point coprocessors and large memory,
this capability is well within the reach of every desktop computer. What was
once insurmountable is now commonplace. 

4. A certain amount of serial nonlinear relationships exists in the markets. If and to 
the extent that this is true, then static models based on mean, variance, and covariance
do not provide a complete basis upon which to model a portfolio. In contrast,
using the empirical distribution appropriately can capture this nonlinearity. Conceiv- 
ably, this can even be extended to intermarket relationships as well. To the extent 
that markets are interrelated or even cointegrated, then we need to resort to the 
empirical distribution for our analysis. 


AVERAGE STOCK RETURNS, BUT COMPOUND
PORTFOLIO RETURNS 


Suppose we had a three-stock portfolio. Our stocks are called A, B, and C, and their 
returns for the period are given by a, b, and c. If we then invest in each in the ratios
25:25:50, the formula to calculate our return at the end of the time period will be:


One period portfolio return = .25a + .25b + .50c


It is a simple weighted sum of the returns for each stock. If the weightings had 
been equal, then the formula would be equivalent to a simple average of the returns 
for each stock. The key point here is that returns within a period should be calculated as 
a weighted average of the individual returns. There is no compounding occurring within 
the period because money used to invest in one position is not available for another. 
Thus, there are no logs or other sophisticated formulas. 

If we have n stocks in our portfolio and the returns are given by $x_1, x_2, \ldots, x_n$, then
the formula for the total portfolio return for the period is

\[ \text{Portfolio return} = \sum_{i=1}^{n} w_i x_i \tag{5.2} \]


The weight of each stock position i is represented by the term $w_i$. This formula is the
general formula for the weighted average of simultaneous positions held during a single
time period. We note that the idea of a time period could be a year, month, week, day, or
even higher intraday frequencies. For our purposes, the time frame is simply an arbitrary 
planning horizon. It can be taken as long or short as is reasonable for the given portfolio 
and trading style of the portfolio manager. 


LOGARITHMS AND THE OPTIMAL EXPONENTIAL GROWTH MODEL


If single-period portfolio returns are found as a weighted average of individual returns, 
we may reasonably ask how to define the long-term compounded growth potential of 
a portfolio. As discussed in the previous section, the single period return is a simple 
weighted average. However, we know that the long-term compounded return is essen- 
tially an exponential growth function. Naturally, the usual method to compute this will 
involve logarithms. The resolution to this seeming conflict is to combine both ideas into 
one formula. 

Specifically, we combine the simple weighted average into a single variable for period
t as follows:


$$\text{Portfolio return for period } t = r_t = \sum_{i=1}^{n} w_i r_i \qquad (5.3)$$


Then the contribution to overall long-term compounded return will be a function of


$$\text{Contribution to compounded return} = \ln(1 + r_t)$$


$$\text{The actual return is: } e^{\ln(1 + r_t)} \qquad (5.4)$$


Therefore, to maximize the compounded return we only need to maximize the ln
term. Switching to exp( ) notation, for m periods we get


$$\text{Multiperiod compounded return} = \exp\left(\sum_{t=1}^{m} \ln(1 + r_t)\right) \qquad (5.5)$$


POSITION SIZING AS THE ONLY GUARANTEED RISK CONTROL


In the preceding discussion, we dealt with how to calculate compounded return. The 
method combines both the arithmetic average of the individual positions within period 
and the compounded or geometric average for a sequence of portfolio returns over sev- 
eral periods. 
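
To make the combination concrete, here is a minimal Python sketch, under assumed inputs, of the two-step calculation: a weighted average within each period, then log compounding across periods as in equation 5.5. The weights, the three periods of returns, and the function names are hypothetical and chosen only for illustration. Note that the value produced is a growth factor, that is, ending wealth per dollar of starting wealth.

```python
import math


def period_return(weights, returns):
    """Within-period return: arithmetic weighted average of the position returns."""
    return sum(w * r for w, r in zip(weights, returns))


def compounded_return(weights, returns_by_period):
    """Growth factor exp(sum of ln(1 + r_t)) over all periods (equation 5.5)."""
    log_sum = 0.0
    for returns in returns_by_period:
        r_t = period_return(weights, returns)
        log_sum += math.log(1.0 + r_t)
    return math.exp(log_sum)


# Hypothetical data: two positions observed over three periods.
weights = [0.5, 0.5]
history = [[0.10, 0.02], [-0.06, 0.00], [0.04, 0.08]]

growth = compounded_return(weights, history)
print(f"Growth factor over {len(history)} periods: {growth:.6f}")
```

The period returns here are 0.06, -0.03, and 0.06, so the result agrees with multiplying 1.06 by 0.97 by 1.06 directly, which is the sense in which the log form is simply compounding.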




However, it begs the question as to how to control risk. The method only addresses 
how to calculate return. The only inputs to the return function are the weights chosen 
by the portfolio manager. In effect, we are left with the rather obvious answer that the 
primary way a portfolio manager can control risk is via the selected weights. 


CONTROLLING RISK THROUGH OPTIMAL POSITION 
SIZING 


The contribution to overall compounded return for a portfolio of only one position with 
weight w is given by 


$$\sum_{i=1}^{n} \ln(1 + w r_i) \qquad (5.6)$$

where w is the respective weighting for the position, and each $r_i$ is the return from a
historical simulation or study of previous returns. To convert this contribution to absolute
dollars, it only requires taking the exp() function of equation 5.6. To find the optimal
position size, one needs only to optimize w with respect to the given historical data.
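
One simple way to carry out this optimization is a grid search over w, sketched below in Python. This is an illustration under stated assumptions rather than the routine used in the book: the sample returns are made up, the search range of 0 to 2.0 and the step of 0.01 are arbitrary choices, and the function names are hypothetical. Any weight that would produce a total loss in some period is scored as negative infinity, since the log is undefined there.

```python
import math


def log_growth(w, returns):
    """Sum of ln(1 + w * r_i) over the sample, as in equation 5.6."""
    total = 0.0
    for r in returns:
        growth = 1.0 + w * r
        if growth <= 0.0:
            return float("-inf")  # total loss in some period: treat as ruin
        total += math.log(growth)
    return total


def optimal_position_size(returns, max_w=2.0, step=0.01):
    """Grid search for the weight w in [0, max_w] that maximizes log_growth."""
    n_steps = int(round(max_w / step))
    best_w, best_value = 0.0, log_growth(0.0, returns)
    for k in range(1, n_steps + 1):
        w = k * step
        value = log_growth(w, returns)
        if value > best_value:
            best_w, best_value = w, value
    return best_w, best_value


# Made-up sample: three gains of 50 percent and two losses of 50 percent.
sample_returns = [0.5, 0.5, 0.5, -0.5, -0.5]
w_star, g_star = optimal_position_size(sample_returns)
print(f"Upper bound on position size: {w_star:.2f}")
```

For this particular sample the maximum falls at a weight of 0.40. A finer step or a proper one-dimensional optimizer would serve equally well; the grid is used here only because it is easy to read.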

A few caveats are in order, however. First, equation 5.6 deals only with a single posi- 
tion. Effectively the one position comprises the entire portfolio. As such, it is appropriate 
only for that case. In the event of more than one position, the formula only helps one 
identify an upper bound for the position size. In other words, optimizing for w in equation
5.6 only identifies the largest amount one should invest in the given position. The
true optimal position size in a multi-investment portfolio is likely to be considerably less. 

Under no circumstances should an individual position ever exceed the upper bound 
given by this formula. Traders often term this situation overtrading. In essence, what an 
investor is doing when the position size exceeds this limit is to actually reduce return 
and increase risk. Literally, it is the worst of both worlds. 

It is simply not rational to exceed this value in a single position. To do so exposes 
the portfolio to added risk without compensating return. It is nothing more or less than 
gambling. However, it is one of the most perfidious forms of gambling in the sense that 
exceeding the upper bound on position also exposes the portfolio and the manager to total
risk of ruin. Clearly this is not a desirable situation. 


MAXIMIZE COMPOUNDED PORTFOLIO RETURN 


The more general case for portfolio returns is the case in which multiple investments are 
held at the same time. In order to deal with this case, we shall need to consider both the 
average return within a period and the effect on sequential compounding that results.
But the overall goal shall be to optimize the portfolio allocation so that the maximum
compounded return is achieved. 

Assume weights $w_1, w_2, \ldots, w_n$, where n is the number of positions, and an expected
return vector $r$, where each $r_i$ is a return for the given security in that period. The word
vector here simply means a list of returns. Note that $r$ is a vector containing observations
for the same period.

Nominally, it would come from a historical study of past prices. However, for our 
purposes it could also come from a lognormal simulation that takes appropriate mea- 
sures to account for correlation between assets. Because the correlations between finan- 
cial assets tend to be high, it is important to view this return vector as returns from the 
same period. This treatment effectively captures the dependency caused by correlation. 

We only need to set up and solve the optimization problem. In keeping with the philosophy
of this book, we shall let the computer do the actual optimization. The objective
function is taken first from equation 5.3, repeated as equation 5.7, which gives us the
simple arithmetic average return for the period.


$$\text{Portfolio return for period } t = r_t = \sum_{i=1}^{n} w_i r_i \qquad (5.7)$$


$$\text{Multiperiod compounded return} = \exp\left(\sum_{t=1}^{m} \ln(1 + r_t)\right) \qquad (5.8)$$


where m is the number of time periods.


MAXIMAL COMPOUNDED RETURN MODELS


In order to maximize the long-term multiperiod return of a portfolio, we must maximize
the return expressed in equation 5.8. For now, we shall not consider how to maximize.
Rather, the important first goal is to further consider what to maximize. Equation 5.8
gives us the long-term formula, but it is important to realize that the overall objective is
reached by a series of steps. Each period represents a step in the long-term process.

Compound interest is a multiplicative process. It is not an additive process. In con- 
trast, the average return within period of a portfolio is an additive process. We add up 
the weighted returns and compute the average. To reflect the fact that the long-term pro- 
cess is multiplicative, at each step we should seek to maximize the log of the portfolio’s 
contribution. Thus, the appropriate formula to optimize is 


$$\max \ln(1 + r_t) \qquad (5.9)$$




This simply represents the inner part of equation 5.8. This is the part that represents 
the within-period returns. So when we maximize this formula, we maximize the ongoing 
multiperiod returns. 

The best way to view such formulas is as a maximal investment size formula. It is the 
amount to invest if one had no aversion to risk at all. The person or portfolio manager 
who is risk indifferent would seek to maximize this formula. Essentially, this is a mathe- 
matical statement of how to maximize one's long-term compounded wealth if one is risk 
neutral. 


WHAT THE MODEL IS AND IS NOT 


The important thing to understand about equation 5.9 is that it maximizes the long-term 
compounded return of the portfolio. In one sense, that is what we want. Clearly, max- 
imizing return is a very important part of investing. The drawback is that maximizing 
return is merely that. It does not fully address the issue of risk. Many investors and portfolio
managers dwell exclusively on return to the exclusion of all else. A problem lies
therein. 

When one exclusively focuses on return to the exclusion of all considerations of 
risk, it creates a situation in which volatility is ignored. In particular, for the case of 
one investment per period, using equation 5.9 to optimize return results in an ultimate 
volatility of returns that is unacceptable to most investors. 

An examination of the optimization function shows that it gradually rises up to the 
maximum point but then suddenly plummets after the maximum value is reached. The
lesson to take away is that missing the maximum point on the low side has little cost.
But an error in finding the optimum value beyond the actual maximum has dire conse- 
quences. It is an example of an asymmetric risk-reward function (see Figure 5.1). 

In Figure 5.1 we have a typical example of a Maximal Return function. The sample
returns used to construct this were simply made up from typical return numbers. In 
normal modeling, these returns would come from historical backtesting or from actual 
trading results in the past. 

We see in Figure 5.1 that the return starts out positive for small values of the weight- 
ing function. In this fictitious example, it reaches its maximum value at 59 percent of the 
portfolio invested in the market. 

One salient feature of this graph is that so much of the right side of the function 
actually leads to large losses. The given set of returns has a positive expectation in the 
usual misguided statistical sense so that is not the problem. The real issue is that as 
more money is invested, the compounded returns increase at a slower and slower rate. 
Ultimately, at about 59 percent invested, the returns actually start to diminish at an ever 
faster pace.


Figure 5.1 Asymmetric Risk-Reward Function (horizontal axis: Position Size, 0.0 to 2.0)


The weightings on the right side, those greater than 1.0, represent the use of leverage. At
1.0, the investor is fully invested, whereas at 2.0 the investor is leveraged 2:1, which
corresponds to using 50 percent margin. Clearly, the excessive use of leverage in this
situation can be catastrophic. 
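
The asymmetry can be checked numerically. The Python sketch below evaluates the average log growth at several position sizes for a made-up return sample (not the data behind Figure 5.1); the function name and the sample are assumptions of this illustration. For this sample the growth-optimal weight works out to 0.4, so 0.2 and 0.6 are equally far from the optimum on either side.

```python
import math


def mean_log_growth(w, returns):
    """Average of ln(1 + w * r) over the sample; -inf if any period wipes out the stake."""
    total = 0.0
    for r in returns:
        growth = 1.0 + w * r
        if growth <= 0.0:
            return float("-inf")
        total += math.log(growth)
    return total / len(returns)


# Made-up sample whose growth-optimal weight works out to 0.4.
sample_returns = [0.5, 0.5, 0.5, -0.5, -0.5]

for w in (0.2, 0.4, 0.6, 0.8, 1.6):
    print(f"w = {w:.1f}  mean log growth = {mean_log_growth(w, sample_returns):+.4f}")
```

Undershooting the optimum by 0.2 costs slightly less than overshooting it by the same amount, and at twice the optimal size (0.8) the average log growth is already negative even though the sample has a positive arithmetic mean.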


In order to model the empirical distribution, as opposed to the theoretical, we must cre- 
ate a random reordering of the historical record. The key consideration is to preserve 
any sequential correlations in the data. The good news is that there generally are few 
linear correlations in the data. That fact is generally promised by the efficient market 
hypothesis. However, there may be relationships in the data that are nonlinear. Preserv- 
ing these is the essence and the art of good portfolio modeling. 

To model empirical data, yet preserve nonlinear relationships at various lags, we 
must preserve both the data and most of the lag relationships. By simply randomizing the 
start time and adopting a relatively long window until the end time, we can effectively 
preserve any nonlinear relationships in the data. 

The primary considerations when modeling with a randomized time model are to de- 
cide on the size of the data window and the number of randomizations to be performed. 
With respect to the window, it should fit the desired holding period or it should be
compatible with the desired reevaluation period. For example, if we want to evaluate our
portfolio on a monthly basis, then a look-back window on the order of a month would be 
appropriate. 

Another consideration is if some past data is needed. For some trading systems, a 
certain amount of data from a prior period is needed. For example, if we need to calculate 
a moving average of 20 days, then we will need to allow for at least 20 extra days in 
our data prior to the random time chosen. Thus, if we randomly chose time t, then the
data window must include day t - 20 through day t - 1 in order to allow for the moving
average.

In addition, we must allow for the actual window of data we wish to analyze. Thus, 
our total required data for this case would be the window plus 20. Although it is certainly 
possible to ignore this issue initially, in the opinion of the author it is better to address 
this in the planning stages of any model. The alternative—to allow the computer model 
to blow up with an error message—generally occurs at the most inopportune times. Ig- 
noring this issue can lead to unknown errors that the computer does not detect. For 
example, it could result in data values of zero inadvertently being used in place of real 
data, or any of a myriad number of other problems. 
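
The window and look-back bookkeeping described above might be sketched in Python as follows. The function name, its parameters, and the use of a plain list for the price or return history are all assumptions of this illustration. The sketch checks up front that enough data exists, so that a shortfall raises an explicit error instead of silently producing bad values.

```python
import random


def sample_window(series, window, lookback, rng):
    """Pick a random start time t and return (prior_data, sample).

    prior_data holds the `lookback` observations before t (for example, the
    20 days needed to seed a 20-day moving average); sample holds the
    `window` consecutive observations starting at t, in their original order.
    """
    needed = lookback + window
    if len(series) < needed:
        raise ValueError(f"need at least {needed} observations, got {len(series)}")
    # t must leave room for the look-back before it and the window after it.
    t = rng.randrange(lookback, len(series) - window + 1)
    return series[t - lookback:t], series[t:t + window]


# Hypothetical usage: 250 daily observations, a 21-day window, a 20-day look-back.
rng = random.Random(42)
data = list(range(250))  # stand-in for real observations
prior_data, sample = sample_window(data, window=21, lookback=20, rng=rng)
print(len(prior_data), len(sample))
```

Because the sample is a contiguous slice, whatever sequential structure the data contains within the window is carried along unchanged. Repeating the call many times gives the set of randomizations discussed above.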


CORRELATIONS 


Correlations are handled as an explicit part of traditional portfolio theory. In that field, 
they are crucial to the calculations and need to be handled directly. However, if we use 
the empirical distribution, relationships may be embedded in the data, possibly in a nonlinear
way. It is important to sample all of the data points from the same time period
for all of the investments in that period. Only in that way can we capture the actual 
underlying relationships. 

One example of this might be if we were analyzing a macroportfolio consisting of 
stocks, bonds, gold, and the dollar. We might well find that stocks and bonds are posi- 
tively correlated at times and at other times they are negatively correlated. So, too, we 
might find that the correlation between the bonds and the dollar waxes and wanes at 
certain times in the business cycle. 

The only effective way to capture these relationships is to pick our sample from a 
randomly selected window in time that includes all of the relevant data points varying 
together as they varied then. We need to resample the entire multivariate distribution, not 
just one isolated variable at a time. In other words, each sample window in the preceding 
example would have to include data for stocks, bonds, gold, and the dollar for the entire 
window. 

As an example, we would sample a window beginning at randomly selected time t.
To capture all of the hidden relationships, we need to consider all of the data for stocks,
bonds, gold, and the dollar at time t as they occurred at that time. We then walk forward
through the entire sample window taking all the observations for that time period. This 
technique captures both the contemporaneous correlations that occur within a single 
period as well as any sequential relationships that may exist. 

We also note that this sampling technique preserves not only linear correlations but 
also other nonlinear relationships. Looking at correlation coefficients is usually sufficient 
to detect linear correlations, but it is no assurance that there are not nonlinear effects 
present. 

It has often been noted that during financial crises all markets tend to be correlated. 
This includes markets that were not previously related, at least not linearly. Using the 
window sampling technique helps to preserve this structure in the empirical data. 
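
A minimal Python sketch of joint window sampling follows. It assumes the history is held as a dictionary of equal-length lists, one per asset, aligned by date; that layout, the function name, and the toy numbers are assumptions made for illustration only. The essential point is that a single random start time is drawn and then applied to every asset.

```python
import random


def sample_joint_window(data, window, rng):
    """Draw one random start time and slice every asset's series at that same time."""
    lengths = {len(values) for values in data.values()}
    if len(lengths) != 1:
        raise ValueError("all series must have the same length and be date-aligned")
    n = lengths.pop()
    if n < window:
        raise ValueError("not enough observations for the requested window")
    t = rng.randrange(0, n - window + 1)
    return t, {name: values[t:t + window] for name, values in data.items()}


# Toy aligned history for the four-asset macroportfolio example (made-up numbers).
history = {
    "stocks": [float(i) for i in range(60)],
    "bonds": [float(i) + 100.0 for i in range(60)],
    "gold": [float(i) + 200.0 for i in range(60)],
    "dollar": [float(i) + 300.0 for i in range(60)],
}

start, joint_sample = sample_joint_window(history, window=21, rng=random.Random(7))
print(start, {name: len(values) for name, values in joint_sample.items()})
```

Sampling each asset with its own independent start time would destroy exactly the cross-market relationships the technique is meant to preserve.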


Earlier in this chapter, we discussed optimization of the single-period portfolio returns,
and specifically for a single investment. Although this may seem risky or foolhardy at
first blush, in fact sometimes it is quite reasonable. For example, suppose an investor
is invested only in the Standard & Poor’s 500 index fund. This fund goes by the moniker
Spyders, with ticker symbol SPY. Effectively, this fund is a diversified portfolio of some
500 large-cap stocks weighted by capitalization. No one could argue that the portfolio is
not sufficiently diversified. Naturally, this generalizes to any index fund or futures product
with similar properties.

Therefore, it is actually reasonable to consider investing in what amounts to a single 
entity. Thus, the discussion of the previous section is quite reasonable under certain 
circumstances. 

However, we can also consider the case of multiple positions. In this case, the same
basic formula applies, but we note that the formula to calculate $r_t$ now must include all
of the portfolio positions. We then have:


$$r_t = \sum_{i=1}^{n} w_i r_i \qquad (5.10)$$


where $w_i$ is the weight given to investment i and $r_i$ is the return from that investment
during period t. Now $r_t$ is the weighted portfolio-level return for period t given the
selected weights for each position.

At this point it only remains for us to optimize the overall return by proper selection
of the weights, all the $w_i$ choices. So we let the computer optimize by maximizing the
following objective function over all the weights.


$$\max \ln(1 + r_t) \qquad (5.11)$$
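
As a rough sketch of what letting the computer optimize can look like, the Python code below searches a coarse grid of weights and keeps the set with the largest summed log growth over the sample periods. Several things here are assumptions of the illustration and not the book's method: the grid search itself (a general-purpose optimizer would normally be used), the long-only and no-leverage constraints, the function names, and the made-up return history. Each row of the history holds the returns of all positions for the same period, as required above.

```python
import math
from itertools import product


def log_growth(weights, returns_by_period):
    """Sum over periods of ln(1 + r_t), where r_t is the weighted return of equation 5.10."""
    total = 0.0
    for returns in returns_by_period:
        r_t = sum(w * r for w, r in zip(weights, returns))
        if 1.0 + r_t <= 0.0:
            return float("-inf")  # portfolio wiped out in this period
        total += math.log(1.0 + r_t)
    return total


def best_weights(returns_by_period, steps=20):
    """Grid search over long-only weights summing to at most 1 (assumed constraints)."""
    n = len(returns_by_period[0])
    best_w, best_g = None, float("-inf")
    for ticks in product(range(steps + 1), repeat=n):
        if sum(ticks) > steps:  # skip leveraged allocations
            continue
        w = [k / steps for k in ticks]
        g = log_growth(w, returns_by_period)
        if g > best_g:
            best_w, best_g = w, g
    return best_w, best_g


# Hypothetical same-period returns for two positions over four periods.
history = [[0.10, -0.02], [-0.05, 0.03], [0.04, 0.01], [-0.02, 0.02]]
weights, growth = best_weights(history)
print(weights, round(growth, 6))
```

The grid grows quickly with the number of positions, so this form is practical only for a handful of assets. It is meant to show the objective being maximized, not an efficient way to maximize it.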




EXPECTED DRAWDOWNS MAY BE LARGE 


The major drawback to using the return optimizing formulas is that the variability is 
likely to be too great. The formulas do maximize long-term compounded return, just as 
advertised. However, because risk is not included explicitly, they are subject to wild 
swings and potentially catastrophic drawdowns. 

Earlier, we cautioned against the folly of investing beyond the recommended opti- 
mum. In fact, returns fall off dramatically. Overinvesting can actually cause a negative 
expected return in a logarithmic sense. At the same time, risk as measured by the vari- 
ance increases. It is the worst of both worlds—lower return and increased risk. Some 
writers who have experimented with similar formulas have suggested that the maximal 
investment be reduced from its existing level to a level one fourth of the calculated size. 

One advocate of the one-fourth heuristic is Ed Thorp, a former M.I.T. mathematics
instructor. He is, perhaps, better known as the author of the best-selling book Beat the
Dealer. The book chronicles his successful attempt to beat the casinos at the game of
blackjack. Thorp’s one-fourth suggestion is merely a heuristic designed to produce a
more palatable money-management formula. However, it leans in the right direction. 
More importantly, it demonstrates how the blind use of the maximal optimization for- 
mula leads to results that many find unacceptable. 
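
The trade-off behind the one-fourth heuristic can be illustrated with a short Python sketch. The return sample is made up, its growth-optimal weight of 0.4 follows from setting the derivative of the average log growth to zero for this particular sample, and the function names are hypothetical. The sketch compares the mean and the standard deviation of per-period log growth at the full optimal size and at one fourth of it.

```python
import math
import statistics


def log_returns(w, returns):
    """Per-period log growth ln(1 + w * r) for a position of weight w."""
    return [math.log(1.0 + w * r) for r in returns]


def summarize(w, returns):
    """Return (mean, population standard deviation) of the per-period log growth."""
    logs = log_returns(w, returns)
    return sum(logs) / len(logs), statistics.pstdev(logs)


# Made-up sample; its growth-optimal weight is 0.4.
sample_returns = [0.5, 0.5, 0.5, -0.5, -0.5]
w_full = 0.4
w_quarter = w_full / 4.0

full_mean, full_sd = summarize(w_full, sample_returns)
quarter_mean, quarter_sd = summarize(w_quarter, sample_returns)

print(f"full size:    mean {full_mean:.4f}, std dev {full_sd:.4f}")
print(f"quarter size: mean {quarter_mean:.4f}, std dev {quarter_sd:.4f}")
```

In this sample, cutting the position to one fourth gives up a little more than half of the average log growth while reducing the variability of log returns to roughly a quarter of its former level, which is the sense in which the heuristic leans in the right direction.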


CHAPTER 6

Utility Models—Preferences Toward Risk and Return


“Be careful what you wish for, you might get it.” 
—Anonymous 


The purpose of this chapter is to discuss the all-important subject of utility models.
The word utility essentially means “usefulness.” If the goal of investing is to create
and enhance wealth, then we need to answer the perplexing question of how to
value wealth. Here we address the broad subject of how useful a dollar is.

First, we introduce the idea of a utility model on the simple grounds that we must
define what it is that we want before we can devise a plan to go after it. The utility model
becomes our goal. We must come to grips with our desired rate of return and our feelings
toward risk.

The chapter covers the history of logarithms and relates some of the theories that
have used logs to model investments. An overview is given of some of the great thinkers
and theories of the utility of money. The discussion ranges from the St. Petersburg Paradox
to the relatively modern work in game theory and utility models such as Arrow’s
work.

The focus is kept on the salient properties of a good utility model. This concept is
developed into a utility model that defines, and is consistent with, optimal long-term
growth of capital at the portfolio level.

The text also discusses the Sharpe Ratio, an investment metric that seems to be
ubiquitous in the industry. A model is developed that will optimize the Sharpe Ratio for
a given portfolio.

There is a general discussion of the many optimization routines available in the R
library. Typically, each routine can be invoked simply by a single-line command. The
section provides an overview of the most useful commands and a brief description of the
situations in which each one applies. The reader will find information on how to obtain
full detailed online help in that section as well.

The Excel add-in package Solver is used as an example of how to set up a spreadsheet
to optimize the Sharpe Ratio. The discussion in the text relates to the example
provided on the CD-ROM that accompanies this book. As always, emphasis is on the 
concepts and understanding how to interpret the results. The Solver package already 
knows how to do the optimization so that you do not have to. 

Optimization examples are provided on the CD-ROM in both Excel and R. Readers 
are encouraged to review these after or during their reading of the chapter. 


BASIS FOR A UTILITY MODEL 


The anonymous aphorism quoted at the beginning of this chapter illustrates the dilemma 
that investors face. In order to make an intelligent investment choice, they must first 
define what it is that they actually want. Defining the investment goal is the first step in 
achieving investment success. However, it is equally important to understand the conse- 
quences that are attendant on the investment goal being considered. 

For example, most people know that they want to maximize return on investment. 
But with that usually comes added risk. So to solely focus on return is certainly faulty 
and likely will lead to undesirable consequences. Some consideration of risk must be 
included. Additionally, we can and should consider other investment measures, such as
probability of gain, compounded portfolio return, skew of the resulting distribution, and 
perhaps even the kurtosis of our chosen investment goal. 

For the remainder of this chapter, we shall use the term utility function for our 
investment goal and discuss how to derive a suitable utility function. 

Clearly, any reasonable utility function should include the average portfolio return, 
as discussed in previous chapters. This then forms the basis for calculating the com- 
pounded portfolio return that should also certainly be included in any goal. 

Because very few of us are risk neutral, we should also include a reasonable measure
of risk in our utility function. Risk has always been a rather ephemeral concept in finance. 
Some would say it is only the amount lost that should be considered in risk. Others argue 
that all of the volatility should be considered, with the largest outliers weighted the most. 

Effectively, this is what the standard deviation function of statistics does. It weights 
the resulting deviations from the mean by their squares and has been shown to achieve a 
maximum likelihood with its weighting. Largely as a result of this, the field of finance has 
settled on the standard deviation as the most accepted measure of variability of return. 
It is effectively equated with risk. We shall adopt this standard as well. 

It is customary in the U.S. investment world to measure everything in dollars. Even 
if one’s currency is euros or yen, there is a simple conversion at any given time that
will convert to the desired numeraire. But is a dollar really worth a dollar? Earlier, we
discussed the fact that most people would not play the St. Petersburg Paradox game at
its stated mathematical expected value nor even anything close to that value. As Daniel
Bernoulli argued, the reason was that people actually had a logarithmic sense of the value 
of money. Thus, it may be reasonable to impute a utility function based on the natural 
log of the dollars in question. 
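
The effect of a logarithmic valuation on the St. Petersburg game can be sketched in a few lines of Python. The payoff convention used here, in which a first head on flip k (probability 1/2^k) pays 2^k dollars, is an assumption of this illustration, as are the function names. The player's existing wealth, which Bernoulli's own treatment takes into account, is ignored for simplicity.

```python
import math


def expected_value(max_flips):
    """Expected dollar payoff when the game is truncated at max_flips flips."""
    return sum((0.5 ** k) * (2.0 ** k) for k in range(1, max_flips + 1))


def expected_log_utility(max_flips=200):
    """Expected ln(payoff) for the same game under logarithmic utility."""
    return sum((0.5 ** k) * math.log(2.0 ** k) for k in range(1, max_flips + 1))


def certainty_equivalent(max_flips=200):
    """Dollar amount whose log utility equals the expected log utility of the game."""
    return math.exp(expected_log_utility(max_flips))


print(f"Expected value, 50 flips:  {expected_value(50):.1f}")
print(f"Expected value, 100 flips: {expected_value(100):.1f}")
print(f"Log-utility certainty equivalent: {certainty_equivalent():.4f}")
```

Each additional flip adds exactly one dollar to the expected value, so it grows without bound, whereas the expected log utility converges to 2 ln 2. Under this convention a log-utility player would value the game at about four dollars.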

It should be emphasized that this is different from, and in addition to, the use of loga- 
rithms that we employed previously in maximizing the compounded return of the portfo- 
lio. That return is still expressed in dollars. However, those dollars require an additional 
log function to be converted to utility units. Thus, one of the fundamental assertions of 
this book is that the utility function of choice should include an explicit ln ln function, or
iterated log function.

Undoubtedly, many of the readers of this book are professionals employed in the in- 
vestment field. For those people, there is another special kind of utility function that must 
be acknowledged. Specifically, it is the metric by which they are measured as managers. 
In modern portfolio management, the metrics have grown ever more sophisticated and
typically do include some sort of explicit recognition of both risk and return, as well as
other measures. In particular, the Sharpe Ratio has become nearly ubiquitous. The popular
Morningstar mutual fund tracking service routinely calculates a Sharpe Ratio for each
of the funds tracked. In addition, other measures of performance are widespread. 

The famous beta calculation that measures the sensitivity of a portfolio to market
moves in the designated index is widely available. Many clients wish to reduce their exposure
to equities to a particular level and thus wish to minimize the overall systematic
correlation with equities. 

For each security or individual investment, there is a beta that measures its intrin- 
sic systematic relationship with the overall market. The amount of variance explained 
by that beta regression is the $R^2$ of that regression. This is separate and distinct from
the total variance that measures the total variability of the investment. Thus, in defining
one’s utility function, the beta represents the market correlation that cannot be eliminated
through simple diversification. In a similar fashion, the $R^2$ measures the amount of
variance that cannot be eliminated through diversification. 
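
As a rough illustration of these two quantities, the Python sketch below estimates beta as the slope of an ordinary least-squares regression of an investment's returns on the index returns, and $R^2$ as the share of the investment's variance explained by that regression. The function name and the sample numbers are assumptions made for the example.

```python
def beta_and_r_squared(asset_returns, market_returns):
    """Return (beta, R^2) from a simple regression of asset returns on market returns."""
    n = len(asset_returns)
    if n != len(market_returns) or n < 2:
        raise ValueError("need two equal-length series with at least two observations")
    mean_a = sum(asset_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum((a - mean_a) * (m - mean_m) for a, m in zip(asset_returns, market_returns))
    var_m = sum((m - mean_m) ** 2 for m in market_returns)
    var_a = sum((a - mean_a) ** 2 for a in asset_returns)
    if var_m == 0.0 or var_a == 0.0:
        raise ValueError("returns must vary for the regression to be defined")
    beta = cov / var_m                     # regression slope
    r_squared = cov * cov / (var_m * var_a)  # squared correlation
    return beta, r_squared


# Made-up returns: the asset moves with the market plus some noise of its own.
market = [0.01, -0.02, 0.03, 0.00, 0.02]
asset = [0.02, -0.03, 0.05, 0.01, 0.02]

beta, r_squared = beta_and_r_squared(asset, market)
print(f"beta = {beta:.3f}, R^2 = {r_squared:.3f}")
```

An asset whose returns were exactly twice the market's in every period would show a beta of 2 and an $R^2$ of 1, meaning none of its variance could be diversified away.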


HISTORY OF LOGARITHMS 


No discussion of the history of logarithms would be complete without a reference to John 
Napier. In many ways, he can be considered the father of the logarithm. In 1614, Napier 
published a book titled Mirifici Logarithmorum Canonis Descriptio. It represents the
first publication of the logarithm concept. 

Clearly, the concept of logarithms or the logical prerequisites for its conceptual genesis
were extant at the time. In particular, the Swiss mathematician Joost Burgi published
his technique for logarithms in 1620, just six years after Napier. For perspective, this
is the same year that the Pilgrims landed at Plymouth Rock. Burgi may have privately 
developed his technique as early as 1588 but was moved to publish only after prompting 
from Johannes Kepler. Although Napier’s publication was already well known at the time 
throughout Europe, the method used by Burgi was quite distinct from Napier’s, and thus, 
clearly had been developed independently. In recognition of his contribution, the lunar
crater Byrgius was named in his honor.

Given today’s electronic technology, few realize that two rulers can be used as a 
simple adding machine. The idea is very simple. If you want to add 2 + 3, all that is 
required is to line up the rulers and then slide one of them to the right so that its zero 
point lines up with the numeral 2 on the other one. Then look to the right on the one that 
was slid to find the 3. The numeral 3 should now line up with the numeral 5 (the answer) 
on the other ruler. Essentially, the idea is to add the length 2 to the length 3 and we find 
that the total length is 5. 

The same basic idea, applied to Napier’s logarithms, underlies the slide rule. (Napier’s
own calculating device, which acquired the moniker Napier’s bones, was a set of
multiplication rods rather than a logarithmic scale.) The key idea was to construct the two
rulers on a logarithmic scale rather than an arithmetic scale as on a standard ruler. Thus,
when the top ruler was slid to the right, the device was now adding the logarithms, which
is the equivalent of multiplying. The answer could be read directly from the opposite ruler
because the scale was in logarithms as well. Thus, Napier’s logarithms led to the slide rule,
which greatly simplified multiplication and division right up until the time that modern
calculators were invented.
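
The principle that adding logarithms is equivalent to multiplying can be checked in a few lines of Python. The function name is hypothetical and the sketch is only an illustration of the arithmetic, not a model of any physical device.

```python
import math


def multiply_via_logs(a, b):
    """Multiply two positive numbers by adding their logarithms, as a log scale does."""
    if a <= 0.0 or b <= 0.0:
        raise ValueError("logarithms are defined only for positive numbers")
    return math.exp(math.log(a) + math.log(b))


print(multiply_via_logs(2.0, 3.0))   # approximately 6
print(multiply_via_logs(12.5, 8.0))  # approximately 100
```

The results agree with ordinary multiplication up to floating-point rounding, just as a physical scale agrees up to the precision with which it can be read.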

In 1728, the Swiss mathematician Gabriel Cramer wrote a letter to his friend Daniel 
Bernoulli. In the letter, Cramer outlined his thoughts on the St. Petersburg Paradox and 
how to resolve its paradoxical nature. Cramer's contribution was to explicitly include 
a utility function as part of his analysis. His proposed utility function was based on the 
square root of the amount in question. Cramer reasoned as follows: 


Mathematicians estimate money in proportion to its quantity, and men of 
good sense in proportion to the usage that they may make of it. 


Thus, he conjectured that the value of money did not rise in linear proportion to the 
arithmetic amount, but rather, rose monotonically but at a decreasing rate as the amount 
of wealth increased. He posited an absolute upper bound of $2^{24}$ coins as the upper limit
of utility. He was also the first to propose the St. Petersburg Paradox in its modern form 
with coin flipping instead of a six-sided die. 

Although Cramer’s choice of the square root was not quite correct, it was his think- 
ing that led Bernoulli to propose the natural log function as the basis for the utility of 
money. Bernoulli added the concept that one’s wealth must be a factor in the concept of 
utility. Cramer had missed this latter point completely, presumably seeking a pure, more 
universal definition of utility. 




Unfortunately, Bernoulli’s utility function and the St. Petersburg puzzle were effec- 
tively lost to modern thinkers. It was only in the 1950s that the paper was rediscovered 
and translated for modern scholars. 

In late 1944, John von Neumann and Oskar Morgenstern published their groundbreaking
work, Theory of Games and Economic Behavior. Their contribution was to
axiomatize the idea of a utility function as a monotonically increasing concave function 
of wealth. Monotonically increasing can be interpreted as “more is better.” The con- 
cavity property means that investors are increasingly sated with greater wealth. In other 
words, a dollar more to a millionaire is not worth as much as a dollar to a pauper. The 
value of a dollar increases, but at a marginally declining rate. 

K. Arrow, G. Debreu, and J. Pratt all made important contributions to the further
rigorous development of utility theory. The important takeaway
from these ideas is that a proper investment utility function is monotonically increasing 
(everywhere of interest) and concave. The concavity can also be stated as saying that the 
second derivative is everywhere negative. Importantly, the log function family satisfies 
these requirements (see Figure 6.1). 
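
Both properties are easy to confirm numerically for the natural log. The Python sketch below, with a hypothetical function name and arbitrary wealth levels, measures the utility gained from one extra dollar at different levels of wealth.

```python
import math


def marginal_utility_of_one_dollar(wealth):
    """Gain in log utility from adding one dollar to the given wealth."""
    return math.log(wealth + 1.0) - math.log(wealth)


for wealth in (100.0, 10_000.0, 1_000_000.0):
    gain = marginal_utility_of_one_dollar(wealth)
    print(f"wealth {wealth:>12,.0f}: one more dollar adds {gain:.8f} utils")
```

The gain is positive at every wealth level, which reflects the monotonically increasing property, and it shrinks as wealth grows, which reflects concavity: the dollar is worth far more to the pauper than to the millionaire.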


Figure 6.1 Log Function


More recently, Professor Mark Rubinstein of the University of California, Berkeley, 
has published a paper discussing the implications of an investment policy that solely 
maximizes growth. The paper offers a straightforward proof of the proposition that op- 
timizing the expected compound growth is the long-run optimal strategy. In addition, he 
addressed the question as to how long it would take for an investor to be 95 percent 
confident that his or her optimal growth maximizing policy in stocks would be superior 
to other more mundane strategies, such as investing in fixed short-term money market 
rates. 

The resulting time frames were surprisingly long using reasonable assumptions from 
the market's own historic rates of return and variability. For example, to achieve a 95 
percent confidence that the optimal strategy would be superior, the investor would have 
to wait more than 200 years. For 99 percent assurance, the wait time is about 4,000 years! 
Reassuringly, the probability that the strategy will be superior eventually approaches 
1. However, it takes much longer than 4,000 years. This raises the obvious rhetorical 
question—can an investor wait so long? 

Clearly the log-maximizing strategy is optimal in an expected log sense. However, it 
may not necessarily be the most desirable. Rubinstein also adds the point that it does 
not necessarily maximize the probability of being the best strategy at all intermediate 
time frames. So if maximizing probability of meeting a certain benchmark return is the 
goal, another utility function may be in order. Specifically framing the utility in terms 
of the probability of success at a given point in time might be one reasonable objective
function. 


OPTIMAL COMPOUNDED UTILITY MODEL 


We are now ready to proceed with the development of a formal model based on logarith- 
mic utility. The key is to define our utility function that will explain what a dollar is worth 
to us. Clearly, by historical precedent and sound logic, the natural log is a clear choice as 
our utility of wealth number. 

We also wish to maximize the overall compounded portfolio return as measured in 
utils. Instead of maximizing wealth directly, we now seek to maximize utility as mea- 
sured by utils. We shall use the term utils as an arbitrary unit of measure to emphasize 
that we are speaking not of dollars but of utility units. From our earlier discussion,
the ln function is the natural choice. So our utility function for wealth at time t becomes


$U_t(W_t) = \ln(W_t / W_{t-1}) = \ln(1 + r_t)$ (6.1)


where $W_{t-1}$ is the wealth or portfolio value from the previous period t - 1 and $r_t$ is the
portfolio return for the period from Eq. 5.3.




Then the objective function is to maximize the compounded utility. That is, we wish
to maximize compounded U(t) at time t. But the contribution to compounding for each
U(t) is $\ln(1 + U_t)$. Substituting from equation 6.1, our objective function becomes


$\text{Objective} = \max \sum_t \ln(1 + U_t)$ (6.2)


We note in passing the use of the iterated logarithm notation. Intuitively, this is mo- 
tivated by the fact that the utility is a log function and the contribution to compounding 
of the utility is also a log function. Hence, the ultimate objective function is in the form 
of the log of a log. 
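As a rough illustration of equation 6.2, the following Python sketch evaluates the objective for a short list of portfolio returns. Python is used here only for illustration, since the book itself works in Excel and R. The sketch reads equation 6.1 as U_t = ln(1 + r_t); the function name log_log_objective and the sample returns are hypothetical. Note that the inner log requires r_t > -1 and the outer log requires U_t > -1.

```python
import math

def log_log_objective(returns):
    """Sum of ln(1 + U_t), with U_t = ln(1 + r_t) (our reading of eqs. 6.1 and 6.2)."""
    total = 0.0
    for r in returns:
        utility = math.log(1.0 + r)        # U_t = ln(1 + r_t); needs r > -1
        total += math.log(1.0 + utility)   # contribution ln(1 + U_t); needs U_t > -1
    return total

# Hypothetical per-period portfolio returns, for illustration only.
sample_returns = [0.02, -0.01, 0.03, 0.00, -0.02]
objective = log_log_objective(sample_returns)
print(round(objective, 6))
```

An optimizer would vary the portfolio weights that produce these returns and keep the weights giving the largest value of this sum.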


The performance of many professional money managers is measured by the Sharpe Ratio 
possibly along with other metrics: 


Sharpe Ratio = $(r_p - r_f)/s_p$ (6.3)


where
$r_p$ is the return for the portfolio
$r_f$ is the risk-free rate of return (usually three-month T-bills)
$s_p$ is the standard deviation of the portfolio returns


Essentially, the numerator measures the excess return the portfolio received over 
and above the riskless rate. The excess return is then divided by the number of units 
of risk as measured by the standard deviation. This gives us the excess return received 
per unit of additional risk taken. This seems to be a reasonable measure of portfolio 
performance. 
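The calculation in equation 6.3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the monthly returns and the riskless rate are made-up numbers, the function name sharpe_ratio is our own, and the sample standard deviation is used for the denominator.

```python
import statistics

def sharpe_ratio(portfolio_returns, risk_free_rate):
    """Excess mean return per unit of standard deviation (equation 6.3)."""
    mean_return = statistics.mean(portfolio_returns)
    std_dev = statistics.stdev(portfolio_returns)  # sample standard deviation
    return (mean_return - risk_free_rate) / std_dev

# Hypothetical monthly portfolio returns and monthly riskless rate.
monthly_returns = [0.01, 0.03, -0.02, 0.02, 0.00]
risk_free_monthly = 0.003
print(round(sharpe_ratio(monthly_returns, risk_free_monthly), 4))
```

The returns and the riskless rate must cover the same period length; here both are monthly.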

In any event, this is the measure of a portfolio manager’s performance that is most 
prevalent in the industry. Everyone who manages money should be aware of what it is, 
how to calculate it, and what it measures. Any manager who is routinely measured by 
this yardstick may wish to consider adopting the Sharpe Ratio as the portfolio objective. 
In so doing, the money manager will be adopting a reasonable objective function as the 
portfolio goal and will be simultaneously optimizing personal performance as measured 
by the Sharpe Ratio metric. 


In the preceding section, we discussed what the Sharpe Ratio is and what it purports to 
measure. It is an important measure of portfolio performance and is widely used in the
industry. Many managers are judged by it. It is probably fair to say that most are at least
partly measured by the Sharpe Ratio. 

Thus, it is important to know how to optimize this ratio and how to target one’s 
portfolio to achieving the best possible Sharpe Ratio. That is the goal of this section. More 
generally, this section can serve as a case study of how to perform portfolio optimization 
for any desired objective function. 

One of the goals of this book is to elucidate the concepts of portfolio modeling in 
an intuitive and easily accessible manner with a minimum of math. In keeping with this 
objective, we shall focus on how to calculate the answer for any given optimization
problem. We will eschew all attempts to derive theoretically elegant optimal solutions and
support those formulas with proofs. Instead, the focus shall be on obtaining the right an- 
swer with a minimum of effort. To a practitioner, this simply means we need to find out 
how large each position should be in order to optimize our objective function. Implicit 
in this is the transcending question of the optimal amount of leverage or cash that one 
should employ. 

Fortunately, there are at least two readily available solutions: 


1. Microsoft Excel has an excellent add-in package called Solver that will perform the
necessary optimization. To use Solver, first define the objective function as a single 
cell in the spreadsheet. For the Sharpe Ratio, the formula would be based on the 
following (patterned after equation 6.3): 


Sharpe Ratio = $(r_p - r_f)/s_p$


Thus, one only need define the portfolio return $r_p$, risk-free return $r_f$, and portfolio
standard deviation $s_p$ based on the weightings and some past data for the positions
being considered. The Solver program will need to know where the weightings are 
to be placed. Often, it is helpful to start with some reasonable initial values. Good 
choices would be equal weighting or capitalization weighting. If we select a reason- 
able choice such as these, Solver is able to run a little more quickly. Essentially, 
Solver will find the weightings that represent the “answer” as far as a practitioner is 
concerned. The weightings are the amounts one should invest in order to optimize
the Sharpe Ratio. 

2. The other way to optimize the objective function is to use the statistical lan- 
guage R (or its close cousin S) as the optimization tool. R will generally be faster 
than Excel, and it will often be easier to implement than the spreadsheet approach. 
This latter point is especially relevant as the number of desired portfolio positions
increases.




The optim routine in the R stats library will generally suffice for most purposes. 
The calling sequence for this function is as follows: 


optim(par, fn, gr = NULL,
      method = c("Nelder-Mead", "BFGS", "CG", "L-BFGS-B", "SANN"),
      lower = -Inf, upper = Inf,
      control = list(), hessian = FALSE, ...)


For our purposes, the parameters are as follows: 


par            Initial values for the various parameters (weights) to be optimized.
fn             The objective or goal function to be minimized (or maximized). The first
               argument is the vector of parameters over which minimization is to take
               place. It should return a single valued scalar result.
gr             A function to return the gradient for the "BFGS", "CG" and "L-BFGS-B"
               methods. For our purposes, set this value to NULL, and a finite-difference
               approximation will be used.
method         The method to be used. For our purposes the default method is fine.
               Type help(optim) and see Details.
lower, upper   Bounds on the variables for the "L-BFGS-B" method.
control        A list of control parameters. See Details.
hessian        Logical. Should a numerically differentiated Hessian matrix be
               returned? Most users will set this to false or omit this variable.


One tip that is well worth knowing is that any maximization problem can easily be 
changed into a minimization problem, and vice versa. To accomplish this feat, one need 
only place an artificial minus sign in front of the function to be minimized. It will then 
become a function to be maximized. If the function is one we wish to maximize, such as 
the Sharpe Ratio, placing the minus sign in front will allow us to use a software routine 
that minimizes the function to find the desired answer. 

For our purposes, it is best to present the problem as a minimization problem in 
contrast to the fact that we usually wish to maximize the Sharpe Ratio. The par vector 
should be a simple list of starting values for the weightings of each investment in the 
portfolio model and one for cash. Setting the gr or gradient parameter to NULL allows 
the routine to calculate its own finite difference derivatives rather than requiring you as 
the user to derive and provide a matrix of derivative functions. Again, the emphasis is on 
simplicity. The optim routine will work fine without the added help from the user. 
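The minus-sign tip can be illustrated with a short Python sketch. It is not the R optim routine: a coarse search over candidate weight pairs stands in for the optimizer, and the asset returns, the riskless rate, and the function names are all hypothetical. The point is only that minimizing the negated Sharpe Ratio selects the same weights as maximizing the ratio itself.

```python
import statistics

# Hypothetical monthly returns for two assets (illustrative numbers only).
asset_a = [0.04, -0.02, 0.03, 0.01, -0.01, 0.02]
asset_b = [-0.01, 0.02, 0.00, 0.01, 0.02, -0.01]
RISK_FREE = 0.002  # assumed monthly riskless rate

def sharpe(weights):
    w_a, w_b = weights
    portfolio = [w_a * a + w_b * b for a, b in zip(asset_a, asset_b)]
    return (statistics.mean(portfolio) - RISK_FREE) / statistics.stdev(portfolio)

def negative_sharpe(weights):
    # The artificial minus sign: minimizing this maximizes the Sharpe Ratio.
    return -sharpe(weights)

# Candidate weight vectors: w_a from 0 to 1 in steps of 0.01, with w_b = 1 - w_a.
candidates = [(i / 100, 1 - i / 100) for i in range(101)]

best_by_min = min(candidates, key=negative_sharpe)  # what a minimizer would find
best_by_max = max(candidates, key=sharpe)           # direct maximization
print(best_by_min, round(sharpe(best_by_min), 4))
```

A function such as negative_sharpe is the kind of fn argument one would hand to a minimizing routine, with the starting weights playing the role of par.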




For more details on this routine, type help(optim) in the R environment to view the 
complete documentation. 

For those users who are more venturesome, another optimization routine worth
exploring is nlminb. Again, more information can be obtained by typing help(nlminb).

Some other routines that can be of interest in R and its sister language S are the 
following: 


• Optimize
• Uniroot
• Univariate
• Glm
• Ms
• Nlm
• Quadprog, for traditional Sharpe Markowitz quadratic programming


Each of these routines differs somewhat in its purpose and input parameters. The
first two routines only deal with univariate optimization. In other words, they are de- 
signed for optimizing a function of only one variable. For the general case, we are deal- 
ing with functions of as many variables as we have candidate portfolio positions. The 
weighting of each position is the variable that we seek. Thus, these first two routines 
are designed for the special case in which only a single trading vehicle is being con- 
sidered, such as the SPY exchange traded fund or the S&P futures contracts. In this 
case, the variable that is optimized is the amount of cash or leverage that should be
employed. 
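For the single-vehicle case, a one-variable search is enough. The Python sketch below uses a golden-section search as a stand-in for a univariate routine and looks for the leverage that maximizes the average log growth of one position. The assumptions are ours: cash earns nothing, borrowing costs nothing, the return series is invented, and the search interval is chosen so that 1 + leverage × return stays positive.

```python
import math

def mean_log_growth(leverage, returns):
    """Average of ln(1 + leverage * r) over the supplied returns."""
    return sum(math.log(1.0 + leverage * r) for r in returns) / len(returns)

def golden_section_minimize(fn, low, high, tol=1e-8):
    """Minimize a unimodal function of one variable on [low, high]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if fn(c) < fn(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2.0

# Hypothetical period returns for a single trading vehicle.
returns = [0.05, -0.03, 0.04, -0.02, 0.03, -0.04, 0.06, -0.01]

# Maximize growth by minimizing its negative; worst return is -4%, so leverage < 25.
best_leverage = golden_section_minimize(
    lambda f: -mean_log_growth(f, returns), 0.0, 20.0)
print(round(best_leverage, 3))
```

Because the average log growth is concave in the leverage, its negative is unimodal, which is the condition the golden-section search relies on.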

The R environment provides excellent help support that includes all the details 
on the calling parameters and details on the options available in each routine. The 
complete help description and parameters can be called up simply by typing help 
(functionName). 


OPTIMIZATION WITH EXCEL SOLVER 


Suppose we wish to optimize a simple portfolio consisting of stocks in the form of the 
SPY ETF, bonds in the form of the TLT bond ETF and risk-free cash via short-term Trea- 
sury bills. A spreadsheet to optimize the Sharpe Ratio for such a portfolio is shown as 
Figure 6.2. The actual spreadsheet program is included on the CD that accompanies this 
book. Interested readers are invited to review the simple formulas that are in that spread- 
sheet. 




Figure 6.2 Sharpe Ratio Optimizer


A few key values in the spreadsheet are worthy of note.


• Cell B2 contains the formula for the Sharpe Ratio. This is our objective function. In
  this case the formula is simply the spreadsheet version of equation 6.3.
• Cells B11 through B17 contain imaginary monthly return data for SPY.
• Cells C11 through C17 contain imaginary monthly return data for TLT.
• Cells B5 through B7 are assigned arbitrary guesses as to portfolio weightings for the
  three investments: SPY, TLT, and cash.
• Cell B8 is the sum of the three weightings and thus adds up to 100 percent.


The problem assumes that the portfolio is 100 percent invested in one of the three 
assets at all times. Also assumed is that the portfolio is not allowed to use leverage. In 
other words, the total investment can never exceed 100 percent. The final restriction on
the portfolio is that it is not allowed to sell short. In other words, position weightings for 
each position have to be zero percent or more. Negative values are not allowed under 
our assumptions. 




Figure 6.3 Solver Parameters


To solve this problem, we seek to maximize the Sharpe Ratio subject to the constraints
given above. The answer we are seeking is the set of portfolio weightings
that will maximize the Sharpe Ratio given the data presented. In order to do this, it will
be necessary to use the Excel Add-In routine called Solver. It should have come standard
with your software, but may need to be installed if it has not been installed already.

The Excel Add-In Solver can be installed by simply choosing Tools/Add-Ins and selecting
the Solver Add-In package. You may have already installed the package; if so, then
that step can be skipped. 

Once installed, Solver can be invoked by clicking Tools/Solver. If Solver has been 
newly installed, it may be necessary to click the down arrow at the bottom of the drop- 
down menu to make it appear for the first time. Once Solver is selected, a pop-up window 
will appear in the form of Figure 6.3. 

Note that the Solver pop-up window has been filled in with some cell values. These 
refer to the cells in the main body of the spreadsheet. In particular, the box labeled Set 
Target Cell has been filled in with B2, our Sharpe Ratio target function. The box titled 
By Changing Cells has the cell range B5:B7 that are our three weightings for the three 
portfolio positions. 

The pop-up window also has another larger box titled Subject to the Constraints.
The four lines in there are the portfolio constraints already discussed. The first three are
that B5, B6, and B7 cannot be negative. This is the prohibition against short selling. The
final line is B8<=1. Cell B8 is the sum of the weightings of all three positions. Because
leverage or borrowing is not allowed for this example, the sum of the three positions
cannot exceed 100 percent, which is represented by the number one in this case. So the
sum of the three weightings must be less than or equal to 1.

We also note that the Equal To section is checked as Max, indicating we would like 
to maximize the target function Sharpe Ratio. Solver is perfectly capable of solving mini- 
mization problems as well. If this had been a problem in which we seek to minimize risk, 
that might be the appropriate choice. 

Normally, the user would fill in these parameters and constraints in order to set up 
the problem. Then, when the pop-up window is filled in, we only need click the Solve 
button to find our solution. Almost instantly (for this problem), a solution will be found 
and another window will ask if we wish to use the Solver solution and to update the 
spreadsheet. Click OK to see the answer. The solution is the new weightings that have 
replaced our original guesses in cells B5 through B7. The new optimized Sharpe Ratio
value is also shown, along with the recalculated average return and standard deviation.
The recalculated spreadsheet solution is shown in Figure 6.4.


Figure 6.4 Sharpe Ratio Optimizer

The reader should now be able to generalize these concepts from this example to 
other portfolio modeling problems simply by using the relevant formulas and concepts 
of this book to set the Solver model. Each problem only needs a target function, the 
constraints, and usually some data. For most of the applications covered in this book, the
weighting values are the desired solution to the problem. 
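The same three ingredients (target function, constraints, data) can be sketched outside the spreadsheet. The Python example below is not the Solver algorithm: it simply tries every weighting on a 5 percent grid that satisfies the no-short, no-leverage constraints and keeps the one with the highest Sharpe Ratio. The seven monthly returns and the riskless rate are hypothetical stand-ins for the spreadsheet's imaginary data, and cash is treated as the residual so the three weights always sum to 100 percent.

```python
import statistics

# Hypothetical monthly returns standing in for the spreadsheet's imaginary data.
spy = [0.021, -0.013, 0.034, 0.008, -0.025, 0.017, 0.012]
tlt = [-0.004, 0.015, -0.009, 0.006, 0.019, -0.002, 0.003]
RISKLESS = 0.003  # assumed riskless rate per month, also earned by cash

def sharpe_ratio(w_spy, w_tlt, w_cash):
    portfolio = [w_spy * s + w_tlt * t + w_cash * RISKLESS
                 for s, t in zip(spy, tlt)]
    return (statistics.mean(portfolio) - RISKLESS) / statistics.stdev(portfolio)

def feasible_weights(step_pct=5):
    """Yield (SPY, TLT, cash) weights: no shorts, no leverage, fully invested."""
    for spy_pct in range(0, 101, step_pct):
        for tlt_pct in range(0, 101 - spy_pct, step_pct):
            if spy_pct + tlt_pct == 0:
                continue  # all cash has zero standard deviation; ratio undefined
            cash_pct = 100 - spy_pct - tlt_pct
            yield (spy_pct / 100, tlt_pct / 100, cash_pct / 100)

best = max(feasible_weights(), key=lambda w: sharpe_ratio(*w))
print(best, round(sharpe_ratio(*best), 4))
```

Because cash in this sketch earns exactly the assumed riskless rate, scaling the SPY and TLT weights up or down together leaves the ratio unchanged apart from rounding, so the search mainly identifies the best SPY-to-TLT mix.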


The safe way to double your money is to fold it over once and put it 
in your pocket. 


—Frank Hubbard 


There is a very easy way to return from a casino with a small for- 
tune: go there with a large one. 
—Jack Yelton 


Correlation is an inescapable fact of financial life. Virtually all stocks are correlated,
and in fact, they are highly correlated. During the same day, stocks will often show
positive correlations on the order of 80 to 100 percent. In a highly correlated
environment, the usual statistical assumptions of independent identically distributed
variables go all awry. Nothing is independent, because everything is highly correlated.

This chapter deals with the issues of many variables that may or may not be correlated.
Initially, the discussion starts with the continuous theoretical distributions and
how correlation plays a part. The role that correlation plays in the maximal log log model
is further discussed.

The next section develops the concept of how correlation impacts the Sharpe Ratio.
From there, the discussion turns to the empirical distribution. The end of the chapter
addresses the question of using the empirical distribution in the presence of correlation
in order to develop the maximal log log model for publication, as well as how to analyze
the empirical definition.




THE CONTINUOUS THEORETICAL DISTRIBUTIONS 


Previously, we have only slightly considered correlation in the discussion of portfolio 
modeling. In this chapter we consider the very important real-world consideration of 
correlation between investments. This discussion naturally breaks down into two basic 
discussions. 

First, if one wishes to assume one of the continuous theoretical distributions, we can 
use the technique of random sampling from known distributions such as the lognormal. 
Second, if one wishes to avoid any unnecessary assumptions regarding the underlying 
distribution, it is best to use the empirical distribution and perform resampling from that 
to model the portfolios. In either event, the correlation between assets and any other 
known and possibly nonlinear relationships should be modeled explicitly. 

Typically, the correlation between stocks is somewhere between 50 percent and 90 
percent. The correlation between broad-based indices is almost always greater than 
90 percent. Correlations between exchange traded funds (ETFs) can be anywhere from 
50 percent to 90 percent, depending on how broad-based they are. The reality is that fi- 
nancial markets are inescapably correlated and interrelated in many ways. Any practical 
model of portfolio behavior must inevitably take this into account. 

For the theoretical distributions, one way to take into account the correlation be- 
tween assets is to calculate the correlation based on the past correlation relationship. 
For this purpose, the formula for the variance of the sum of two correlated assets illustrates
the importance of explicitly incorporating the correlation into our model. In equation 7.1, we
note that the variance of the sum of X and Y differs from the sum of the individual variances
by twice the covariance between the two variables:


$\mathrm{VAR}(X + Y) = \mathrm{VAR}(X) + \mathrm{VAR}(Y) + 2\,\mathrm{COV}(X, Y)$ (7.1)


We emphasize that the formula doubles the impact of the covariance between the 
two variables. Thus, a portfolio of positively correlated assets will always have more 
variance than the simple sum of the individual variances of the positions. 
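A quick numerical check of this identity, sketched in Python with two invented and positively correlated return series, may help fix the idea. The series and the helper name covariance are hypothetical; sample (n - 1) estimates are used throughout so that the two sides match.

```python
import statistics

def covariance(xs, ys):
    """Sample covariance with the same n - 1 divisor as statistics.variance."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y)
               for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical returns for two positively correlated assets, same periods.
x = [0.02, -0.01, 0.03, 0.00, -0.02]
y = [0.015, -0.005, 0.02, 0.005, -0.015]

combined = [a + b for a, b in zip(x, y)]   # period-by-period sum X + Y
lhs = statistics.variance(combined)
rhs = statistics.variance(x) + statistics.variance(y) + 2 * covariance(x, y)
print(lhs, rhs)
```

With positive covariance, the combined variance exceeds the simple sum of the two individual variances, which is the point made in the text.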

Alternatively, we can randomly resample from the putative theoretical distribution. 
In so doing, we must explicitly include recognition of the correlation between the assets.
Equation 7.1 can be quite helpful to model random samples from the joint probability 
distribution of a pair of correlated assets. 


MAXIMAL LOG LOG MODEL IN THE PRESENCE 
OF CORRELATION 


In the previous chapter we discussed the innovative log log model proposed in this 
book. However, the discussion assumed no correlation between assets for simplicity and 


Money Management Formulas Using the Joint Multiasset Distribution 95 


intuitive appeal. It is now time to discuss the real-world framework in which correlation 
is an ever-present fact of life. 

We now take our log log utility model from the previous chapter and incorporate
variance and covariance into it. Thus, our discrete portfolio return model at time t
becomes


$r_t = \sum_{i=0}^{n} w_i r_{it}$ (7.2)


where $r_{it}$ are the individual security returns at time t for security i.


$U(t) = \sum_t \ln(1 + \ln(1 + r_t))$ (7.3)


Equations 7.2 and 7.3 appear to have no explicit term for correlation. The reason 
such a term is not needed is that the returns are treated as a contemporaneous vector of 
returns. We recall that a vector is simply a list. In this case the list is a list of returns from 
the same period in time. 

Thus, any correlation between the different returns is already taken into account by 
treating all of the returns from a single period together as a group. The correlation is built 
into the calculation of the individual returns themselves. So we need add no special term 
to treat the correlation separately for the discrete case. 

It is important to note that the returns for each period must be taken together as a 
group for this type of analysis to work correctly. The returns for period t already include 
the effects of correlations between them and so must be taken together. Conversely, 
taking returns for security i through all periods separately would not take into account
the effects of correlation between the different positions within a given period. That must 
be explicitly considered for each return calculated. 


The standard Sharpe Ratio developed by William F. Sharpe and now included as part of 
the CAPM model is given by 


Sharpe Ratio = $(r_p - r_f)/\sigma_p$ (7.4)


where the subscript p stands for the portfolio and $r_f$ is the risk-free rate on short-term
Treasury bills. Effectively the Sharpe ratio measures the amount of excess return of the
portfolio per unit of risk taken. It is the measure of portfolio performance that is most 
widely used in the financial industry. Thus, by any reasonable standard it represents an 
important metric by which portfolio managers are judged. 




A potentially more useful variation of the Sharpe Ratio would be to calculate within- 
period returns as usual, but treat overall returns as continuously compounded via the 
use of the logs of the returns. Effectively, we would be calculating an average of the logs 
of the returns and the standard deviation about that average. 

As further enhancement of this idea, the concept can be combined with the use of 
the log log utility function to obtain an objective function that incorporates the concept 
of maximizing the log log return per unit of log log risk undertaken. 

Thus, our modified log log Sharpe ratio would be based on the expected log log
return less the riskless rate per unit of risk taken in a log log sense. We can define $r_{LL}$ to
be the expected log log return as follows:


$r_{LL} = \ln(1 + \ln(1 + r_p))$ (7.5)


In a similar fashion, we have for the log log standard deviation the formula as follows:


$\sigma_{LL} = \sqrt{\mathrm{VAR}(\ln(1 + \ln(1 + r_p)))}$ (7.6)
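One possible reading of the modified ratio is sketched below in Python. Several choices here are our assumptions rather than the book's: the expected log log return is estimated by the sample mean of ln(1 + ln(1 + r)), the log log risk by the sample standard deviation of the same quantity, and the riskless rate is subtracted as given, without transformation. The function names and the return series are hypothetical.

```python
import math
import statistics

def log_log_return(r):
    """ln(1 + ln(1 + r)); requires ln(1 + r) > -1."""
    return math.log(1.0 + math.log(1.0 + r))

def log_log_sharpe(portfolio_returns, risk_free_rate):
    transformed = [log_log_return(r) for r in portfolio_returns]
    mean_ll = statistics.mean(transformed)    # sample estimate of the log log return
    sigma_ll = statistics.stdev(transformed)  # sample log log standard deviation
    # Assumption: the riskless rate is subtracted untransformed.
    return (mean_ll - risk_free_rate) / sigma_ll

# Hypothetical per-period portfolio returns.
sample = [0.02, -0.01, 0.03, 0.01, -0.005]
print(round(log_log_sharpe(sample, 0.001), 4))
```

If one prefers to put the riskless rate on the same footing as the portfolio return, log_log_return could be applied to it first; the text does not settle that point.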


THE EMPIRICAL DISTRIBUTION 


The disadvantages of using theoretical distributions such as the normal and lognormal 
have been previously discussed. Although such distributions offer some advantages in 
intuition on occasion, and can provide us with closed-form solutions, generally they are 
to be avoided. A much better practice is to use robust resampling from the empirical 
distribution to model one’s portfolio. This avoids all of the issues of nonnormality and 
fat tails. 

Essentially, the idea is to randomly select past periods and then use all of the stocks 
or trades from that period together as a group. This effectively captures the correla- 
tion between them in a simple, straightforward manner. It also obviates the need to deal 
with the correlations via the variance-covariance matrix with its associated Cholesky 
decomposition. 

The procedure is to randomly select each time period and then calculate the portfolio
returns for the given weightings. Then take the desired utility function or goal function of
the computed portfolio return. The final step is to maximize the goal function by varying 
the weights during successive resampling iterations. This step can be handled either by 
Solver or by the various multivariate optimization routines available in R. 

Using the formulas given earlier for the log log case we have:


$r_t = \sum_{i=0}^{n} w_i r_{it}$ (7.7)


where $r_{it}$ are the individual security returns at time t for security i.


$U(t) = \sum_t \ln(1 + \ln(1 + r_t))$ (7.8)
7 


Intuitively the U(t) function is to be interpreted as the utility of the log log function.

We substitute the actual returns from randomly selected time t into equation 7.7 and
the resulting portfolio return $r_t$ is used as input to the U(t) function per equation 7.8.
The standard optimization routines from Excel, S, and R can be used to find the optimal
portfolio weighting vector w.
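The resampling procedure can be sketched as follows in Python. This is an illustration only: the two-security history is invented, the function names are ours, and a coarse grid of candidate weights stands in for a real optimization routine. The key step is that each random draw takes one whole period, so the returns of all securities for that period stay together.

```python
import math
import random

# Hypothetical history: each row is one period, each column one security.
history = [
    (0.021, -0.004), (-0.013, 0.015), (0.034, -0.009), (0.008, 0.006),
    (-0.025, 0.019), (0.017, -0.002), (0.012, 0.003),
]

def portfolio_return(weights, period_returns):
    """Weighted sum of the security returns for one period (equation 7.7)."""
    return sum(w * r for w, r in zip(weights, period_returns))

def resampled_utility(weights, history, n_samples, rng):
    """Sum of ln(1 + ln(1 + r_t)) over randomly drawn whole periods (equation 7.8)."""
    total = 0.0
    for _ in range(n_samples):
        period = rng.choice(history)  # all securities from the same period
        r_t = portfolio_return(weights, period)
        total += math.log(1.0 + math.log(1.0 + r_t))
    return total

# Coarse candidate weights; the same seed gives every candidate the same draws.
candidates = [(i / 10, 1 - i / 10) for i in range(11)]
best = max(candidates,
           key=lambda w: resampled_utility(w, history, 1000, random.Random(42)))
print(best)
```

Reusing the same seed for each candidate means the weightings are compared on identical resampled periods, so differences in the utility reflect the weights and not the luck of the draw.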


It is important to understand what the maximal log log function is and what it is not. The 
maximal log log model is that weighting of portfolio investment sizes that will maximize 
the long-run log log return of the portfolio. It is optimal in that sense only. We recall that, 
in general, it is possible to optimize only one thing at a time. 

By definition, the log log utility function includes two log functions. One of these log
functions maximizes the compounded return of the log of the wealth ratio. The other log
function of the two provides us with a measure of the utility of the wealth that we are 
compounding. 

It is useful to note that the log function is concave. The slope of the curve is everywhere
decreasing. Thus, the log function contains a kind of built-in risk aversion when
considered from an arithmetic viewpoint. This can be seen more clearly in Figure 7.1. 

The iterated log function has a double risk aversion built in. Each of the log functions
adds additional concavity to the function. In that sense, it is clearly more risk averse than
the simple maximal return formulas presented earlier. Figure 7.2 illustrates the additional
concavity of the iterated log function.

The reader will recall that one of the well-known problems with the maximal returns
type of formulas is that they result in unacceptably large volatility. The conservative
double-risk reduction of the iterated log function directly addresses this in a way that also
optimizes a reasonable utility function. The reader is strongly encouraged to consider
this function as the basis for one’s portfolio modeling efforts.


The presence of correlation between variables can also impact the calculation of the 
Sharpe Ratio when one assumes the empirical distribution as the governing premise. 
There are really two issues in this discussion. 


Figure 7.1 Log Utility Function


Figure 7.2 Log Log Utility Function


First, the variables can be simultaneously correlated. When IBM goes up, then GM is 
likely to go up during the same period. This type of correlation goes by many different 
names. One would be simultaneous correlation, while another would be coincident
correlation. When speaking in a time series context, the term might be correlation with lag
zero.

The second type of correlation involves a serial dependence between one day and 
the next day’s trading. The efficiency of the markets guarantees us that this correlation 
is likely to be rather small. However, it should be considered anyway. The name for this 
is autocorrelation, implying that the data is correlated with itself at some earlier period.
Another name for this type of correlation is serial correlation. For this type of correla- 
tion, the number of time periods between two correlated observations is known as the 
lag. 

There is another type of serial relationship that may not be a true correlation at all. 
In ordinary correlation, there is a presumed linear relationship between the variables, 
even if the variables are returns from the same stock separated by a certain number of 
days called lag. If the correlation is zero or nearly so, then we can say that there is little 
or no linear correlation between the variables under consideration. 

However, this does not preclude a nonlinear relationship between variables. There 
are very many possible nonlinear relationships that we could imagine. For example, it 
is well known that the volatility of the market is autocorrelated. If the market has been 
recently volatile, it is likely to continue to be volatile. One way to measure this is to look
at the correlation coefficient for consecutive values of the squared returns for the
market. There is a consistent positive correlation for this variable despite the fact that
the linear values of the returns generally show little or no correlation.
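The comparison between raw and squared returns can be sketched in Python. The series below is deliberately stylized and entirely hypothetical: a calm stretch, a volatile stretch, and another calm stretch, with signs arranged so the raw returns show only a small lag-one correlation. The function autocorrelation is our own helper and expects a lag of at least 1.

```python
import statistics

def autocorrelation(series, lag=1):
    """Lag-k autocorrelation coefficient; lag must be at least 1."""
    mean = statistics.mean(series)
    deviations = [x - mean for x in series]
    numerator = sum(a * b for a, b in zip(deviations[:-lag], deviations[lag:]))
    denominator = sum(d * d for d in deviations)
    return numerator / denominator

# Stylized returns: same sign pattern, small moves when calm, large when volatile.
pattern = [1, 1, -1, -1, 1, 1, -1, -1]
calm = [0.002 * s for s in pattern]
volatile = [0.03 * s for s in pattern]
returns = calm + volatile + calm

squared = [r * r for r in returns]
raw_ac = autocorrelation(returns)      # correlation of returns with lag one
squared_ac = autocorrelation(squared)  # correlation of squared returns with lag one
print(round(raw_ac, 3), round(squared_ac, 3))
```

For this stylized series the squared returns show a much larger lag-one coefficient than the raw returns, which is the volatility-clustering pattern described above. Real market data would of course be noisier.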


In order to deal with coincident or coterminal correlation from an empirical multivariate 
distribution we need to obtain our random sample from all of the variables in question 
for that time period. It is not sufficient to only sample one variable from a given time 
period and another variable from a different time period. It would not effectively capture 
the correlation that may be present. 

Because we know that return variables during the same time period will be highly
correlated, it is imperative that all of the variables come from the
same period. This will effectively capture the correlation, as well as enable us to consider 
the effects of the correlation on the variance. 

Alternatively, if we were only to capture variables from different periods, we are assured
that they would not have the necessary correlations. The efficient markets hypothesis
assures us of this fact. Thus, the only way to perform an analysis using the empirical
distribution is to collect our randomly sampled data as a set of points or as a vector of
the given variables in question.


Modeling Nonlinearity and Autocorrelation 


In order to randomly sample from the empirical distribution and compensate for nonlin- 
ear relationships, we must take certain precautions. Ideally, we wish to include all of the 
variables from a time period according to the previous discussion. However, often that 
is not enough. It will only capture correlations and relationships that exist within that
time period. 

If we suspect nonlinear relationships exist, then we need to take a more general 
approach. We need to remember that these relationships may not even show up as sim- 
ple linear correlations. Thus, they may exist, but we cannot even guess at their correct 
mathematical form. 

There may also exist autocorrelations and cross-correlations between variables at 
different time lags and for different periods of time. In order to capture these, it is not 
sufficient to merely include all of the variables from one period in time. We must include 
variables from a recent window in time in order to model these sequential relationships 
in time. 

Therefore, the correct way to randomly sample our data would be to randomly select
a time period t. Because each time t represents a window of multivariate data, the data
variables for each time period must be captured. But more than that, the same set of
variables for previous periods prior to time t must be captured and used as our sample.

Ordinarily, for statistical testing and modeling, we would prefer to consider nonoverlapping
periods. We would also consider each one only once. The idea is to avoid double
counting and keep the observations statistically independent from one another.

However, for our empirical sampling models, we can and should allow overlapping 
time periods. Effectively, we are letting the sampling technique perform our analysis for 
us, even if the data we feed it have dependencies, as they must. We are resampling at 
random times and not just random data. 

Those who feel uncomfortable with this technique can adopt fixed periods and simply
resample from those. However, this technique has the drawback of reducing the number
of permissible data samples by a factor of the window size. Suppose we had a sample
of 500 days and chose to model a 20-day look-back window. This will give us about 24 
nonoverlapping time periods. But if we allow any day to be randomly selected, the num- 
ber of possible starting points is now about 480. Note that we need to allow for an initial 
look-back window of 20 days at the beginning of our data. 

To summarize, either method will produce good results. However, the method of 
randomly sampling any time period is generally to be preferred, provided a sufficiently 
large number of samples is used. 
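The two sampling schemes can be compared with a small Python sketch. The indexing convention is our assumption, chosen to mirror the counts quoted in the text: the first 20 rows are reserved as the initial look-back, a sample is the 20 consecutive rows ending just before a chosen position, and fixed periods are positions spaced one window apart. The function names and the placeholder data are hypothetical.

```python
import random

def eligible_endpoints(n_days, window, overlapping=True):
    """Positions at which a look-back window of `window` rows may end."""
    step = 1 if overlapping else window   # any day, or fixed periods only
    return list(range(window, n_days, step))

def sample_window(rows, window, rng, overlapping=True):
    """Draw one window of consecutive rows, keeping their time order."""
    end = rng.choice(eligible_endpoints(len(rows), window, overlapping))
    return rows[end - window:end]

n_days, window = 500, 20
rows = list(range(n_days))  # placeholder for 500 daily vectors of returns

fixed_count = len(eligible_endpoints(n_days, window, overlapping=False))
any_day_count = len(eligible_endpoints(n_days, window, overlapping=True))
print(fixed_count, any_day_count)

rng = random.Random(0)
one_sample = sample_window(rows, window, rng)
print(one_sample[0], one_sample[-1])
```

Under this convention the counts come out to 24 fixed periods and 480 possible positions when any day may be selected, in line with the figures in the example above.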


Most work with models is based on backtesting a given concept with historical
data. Invariably, the issue of good-quality data arises in every study based on
historical data. This chapter deals with the issues of ensuring good data and
how to deal with empirical data in general.

In particular, the discussion inevitably leads to how to synchronize data between
different markets in different time zones, with consideration to the varying days and holiday
schedules in different countries. In order to conduct a proper study, most analyses
require the use of net change data in lieu of price level data. In addition, the chapter
discusses the need to use intraday highs and lows in one’s work.

As always, the focus is on predictive studies versus retrospective nonpredictive studies.
Numerous data pitfalls are discussed, including the fact that adjusted data may be
erroneous, for many possible reasons. The user is urged to maintain his own adjusted
database for the sake of data integrity.

Finally, this chapter deals with the important topic of robust resampling methods
to analyze the data. The reader is shown how robust methods can be used to develop
confidence limits for everyday statistics in a way that does not depend on the quirks of
the underlying distribution.

In this way, the serious portfolio modeler can be freed from the curse of the fat-tailed
underlying distribution.


ASSURING GOOD DATA 


Any historical study of investment vehicles is only as good as its data source. For that 
reason, it is important to discuss the data sources and the myriad of pitfalls that can 
befall someone who uses historical data sources. First, we shall deal with the seemingly 
simple matter of assuring data quality. 

Perhaps the simplest and best method to assure quality is to find two reputable and
independent sources of data and to simply compare them against each other. Such a test
should include a simple comparison of the same price on the same date. It would also include
a count of missing days for each vendor. An additional test should be made to find extra
days that one vendor may have added or omitted relative to the other.
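
A minimal sketch of such a two-vendor comparison follows. It assumes each vendor's closing prices have already been loaded into a dictionary keyed by ISO date string; the function name and the half-cent tolerance are illustrative assumptions, not part of any vendor's interface.

```python
def compare_vendors(a, b, tolerance=0.005):
    """Compare two vendors' closes, each given as {ISO date string: price}.

    Reports dates present in one source but not the other, and dates where
    the two prices differ by more than `tolerance`.
    """
    dates_a, dates_b = set(a), set(b)
    return {
        "missing_in_a": sorted(dates_b - dates_a),
        "missing_in_b": sorted(dates_a - dates_b),
        "mismatches": sorted(
            d for d in dates_a & dates_b if abs(a[d] - b[d]) > tolerance
        ),
    }

# Hypothetical data for illustration only.
vendor_a = {"2024-01-02": 10.00, "2024-01-03": 10.50, "2024-01-04": 11.00}
vendor_b = {"2024-01-02": 10.00, "2024-01-03": 10.60, "2024-01-05": 11.20}
report = compare_vendors(vendor_a, vendor_b)
```

Every date that appears in the report deserves a manual look before either source is trusted.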

For daily or weekly data, the data fields should include Open, High, Low, Close, and
Volume. An often-overlooked factor is that stock data needs to be adjusted. Such adjustments
include stock splits, as well as dividends and other corporate distributions.
Thus, accompanying each stock should be a file of adjustment data or something similar
embedded in the data itself. For each record, the date should be explicitly included as
a separate field. For Futures or Options, the Open Interest field should be available as
well.


SYNCHRONIZE DATA 


There are many reasons data must be synchronized. One reason is that some markets do 
not trade when other markets do. For example, the bond markets observe more holidays 
than the equity markets. Different countries observe different holiday schedules. Thus, 
one market may be open while a similar market in a different country is closed for the local
holiday. 

Synchronization is important for several reasons. First, most studies of price move- 
ments should be performed using net change data. Thus, for comparability the net 
changes should be for the same period of time. Irrespective of whether the time frame is 
daily, weekly, or even hourly, it should reflect the same period of time. 

For example, when we are trying to measure the correlation between two stocks,
there will be subtle bias issues if sometimes the net change figure is for two days compared
to one day. By its very nature the two-day net change will typically be about 1.41
times larger than the one-day change. This will impact the correlation calculation that
relies on the cross-product between the two data series.
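
The factor of 1.41 is the square root of 2. A short derivation, assuming the two daily net changes are independent and share a common standard deviation (the random walk assumption, with the symbols introduced here only for illustration):

```latex
\operatorname{Var}(\Delta_1 + \Delta_2) = \sigma^2 + \sigma^2 = 2\sigma^2
\quad\Longrightarrow\quad
\operatorname{sd}(\Delta_1 + \Delta_2) = \sqrt{2}\,\sigma \approx 1.41\,\sigma
```

Here $\Delta_1$ and $\Delta_2$ are the two one-day net changes and $\sigma$ is the one-day standard deviation.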

There are several ways to synchronize data. One way is to simply defenestrate any 
days that show missing data for whatever reason. We simply remove those days from our 
sample in all of our data series. If even one day is missing from, say, a German stock, that
day is removed from all of the other series as well. Thus, each net change in the data we 
keep is still comparable to the other data series. 
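
The first synchronization method can be sketched as follows. The sketch assumes each market's prices are held in a dictionary keyed by ISO date string (which sorts chronologically); the function names and the sample data are hypothetical.

```python
def synchronize(series_by_market):
    """Keep only the dates that appear in every market's series.

    series_by_market: {market name: {ISO date string: price}}
    Returns the common dates (sorted) and one aligned price list per market.
    """
    common = set.intersection(*(set(s) for s in series_by_market.values()))
    dates = sorted(common)
    aligned = {name: [s[d] for d in dates] for name, s in series_by_market.items()}
    return dates, aligned

def net_changes(prices):
    """Net change between consecutive retained observations."""
    return [later - earlier for earlier, later in zip(prices, prices[1:])]

# Hypothetical example: the German series is missing 2024-01-03.
raw = {
    "US": {"2024-01-02": 100.0, "2024-01-03": 101.0, "2024-01-04": 103.0},
    "DE": {"2024-01-02": 50.0, "2024-01-04": 52.0},
}
dates, aligned = synchronize(raw)
us_changes = net_changes(aligned["US"])
de_changes = net_changes(aligned["DE"])
```

After the drop, both series measure the change over the same span of dates, so the cross-products used in a correlation line up.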

Another way to synchronize the data is to keep only the latest data for which all
of the dates are equal. Yet another technique is to combine added dates so as to
artificially create matching pairs of dates.


USE NET CHANGES NOT LEVELS


It is important to use the net changes rather than price levels for several reasons. One
of the more important of these reasons is that the levels are always autocorrelated in a 
random walk time series. It is easy to understand why this is so. For an additive random 
walk model, today’s price is the sum of yesterday’s price plus a small random innova- 
tion. Each successive price level is the sum of all of the recent innovations. Thus, it is 
correlated with each, at least to some extent. For example, the sum of the last 20 days’ 
innovations will not change much when one more daily observation is added. The pre- 
vious 20 days still represent about 95 percent of the total variation. It follows that the 
correlation will be comparably large. 

In fact, empirical studies done by the author show that artificially constructed ran- 
dom walks are generally positively serially correlated in the neighborhood of 65 per- 
cent to 95 percent. These correlations are quite large and will doom any statistical study 
to many false positives. It is to be emphasized that these were randomly generated 
data studies, and thus, there should be no correlation except what was artificially in- 
duced by the process. The inescapable conclusion is that using levels of prices is a 
faulty process and will cause spurious correlations where no real correlation actually 
exists. 
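
The effect is easy to reproduce. The sketch below is not the author's study; it is a small illustration that builds one artificial additive random walk from normally distributed innovations (an assumption) and compares the lag-one serial correlation of the levels with that of the net changes.

```python
import random

def lag1_autocorrelation(x):
    """Sample serial correlation between each value and the next one."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

def random_walk(n, rng):
    """Additive random walk: each level is the prior level plus an innovation."""
    level, levels = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        levels.append(level)
    return levels

rng = random.Random(42)                      # seeded so the sketch is repeatable
levels = random_walk(500, rng)
changes = [b - a for a, b in zip(levels, levels[1:])]

r_levels = lag1_autocorrelation(levels)      # large, although the data are random
r_changes = lag1_autocorrelation(changes)    # close to zero
```

The levels show strong serial correlation even though nothing but chance went into them, while the net changes do not.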

Strictly speaking, this applies only to an additive random walk. As previously dis- 
cussed, stock markets are probably best modeled as multiplicative random walks. This 
type of model best accounts for the positive compound returns of stocks over very long 
periods of time. However, the same arguments are valid here as well, because a multi- 
plicative model can also be expressed as an additive model in the logs of prices. Thus, 
large spurious correlations will be induced in the logs by using price levels. Simply put, 
the research based on correlations involving price levels will find correlations that are 
not real. 

There are several ways to calculate the net change for a given time period. The simplest
is to take the net difference in points. This is literally the net change as measured
in points.

Over time, stock prices tend to drift higher. The longer the time, the greater the 
upward drift. If we are studying a period of time during which the drift has been large, 
it is quite likely that the net point changes later in the time series will be larger than
those earlier. The variance of the time series will increase with time. Statisticians call 
this heteroskedasticity. It simply means different variances.

Statistically speaking, it is a situation that we wish to avoid. The variance in the sec- 
ond half of the series will be larger than the variance in the first half. Thus, the standard 
deviation of the second half will be larger than the first half. Equally important, the data 
will appear to be dominated by the latter part of the price series. 

To address the problem of heteroskedasticity, we need only recast the net change 
as a percent change. Generally speaking, this will render the changes of the first half 
comparable to those of the second half. For any longer-term studies of more than a year, 
using percent changes should be the norm and is highly recommended as a best practice. 
For time periods shorter than a year, the net point change is probably good enough. 

We have discussed the fact that the equity markets exhibit long-term compounded 
growth. This is in contrast to other markets, such as commodities and bonds, that gen- 
erally show no long-term growth. The ideal way to measure price change in compound 
growth situations is to use the logs of price changes. We take the relative price change 
and then take its natural log. 


$$\text{Relative price change} = \ln\!\left(\frac{P_{t+1} + D}{P_t}\right) \tag{8.1}$$

where $P_t$ is the price at time $t$ and $D$ is the shareholder distribution (usually a dividend
or split).
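
The three ways of measuring a change discussed in this section can be put side by side. The function names below are illustrative; the last one follows equation 8.1, with the distribution defaulting to zero.

```python
import math

def point_change(p0, p1):
    """Net change in points."""
    return p1 - p0

def percent_change(p0, p1, distribution=0.0):
    """Net change as a fraction of the starting price."""
    return (p1 + distribution - p0) / p0

def log_relative_change(p0, p1, distribution=0.0):
    """Natural log of the price relative, as in equation 8.1."""
    return math.log((p1 + distribution) / p0)

# Log changes add across periods, which suits compound growth.
two_steps = log_relative_change(100.0, 110.0) + log_relative_change(110.0, 121.0)
one_step = log_relative_change(100.0, 121.0)
```

Because the log changes for successive periods sum to the log change for the whole span, they are convenient for compounded return work.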


ONLY USE INFORMATION FROM THE PAST 


Anyone who has watched a science-fiction movie about time travel knows the dangers 
of knowledge of the future. It seems that every work in the field invariably raises the 
question of the great time travel paradox. If you go back into the past and kill one of your 
ancestors before he passed on his genes, you will never exist. If you never existed, then 
how did you go back into the past to kill your ancestor? 

The theme of altering the future seems to be mandatory for this genre. The plot often 
revolves around the temptation to use one’s knowledge of the future to improve it and 
the stern warning that any such attempt is evil and ultimately doomed to failure. 

Such dire warnings apply equally well to one who performs studies based on histor- 
ical data. Only use information from the past. One should never let knowledge of the 
future creep into the results. 

This point may seem obvious and almost need not be stated. Yet countless studies 
suffer from the problem of knowledge of the future. In fact, the problem is so insidious 
that it tends to arise without the researcher being cognizant of the fact. 


One way knowledge of the future can creep in is through simple software mistakes. 
In most programming languages such as R, there is a facility for specifying the date of 
a particular price. Often, this boils down to a simple subscript or integer index. If we 
were programming a simple moving average and the subscript were off by one, then the 
average might include data from one day in the future. In Excel, the analogous situation 
would be when a range specification is off by one or more. If the range includes informa- 
tion from the future, the results are likely to be too good to be true. 
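
The off-by-one mistake is easiest to see next to a correct version. The sketch below uses Python rather than R or Excel, and the function names and prices are invented for illustration. The second function is wrong on purpose.

```python
def moving_average(prices, window, t):
    """Average of the `window` prices ending at index t; uses nothing after t."""
    if t < window - 1:
        raise ValueError("not enough history before index t")
    return sum(prices[t - window + 1 : t + 1]) / window

def moving_average_off_by_one(prices, window, t):
    """DELIBERATELY BUGGY: the slice is shifted by one and includes prices[t + 1]."""
    return sum(prices[t - window + 2 : t + 2]) / window

prices = [10.0, 11.0, 12.0, 13.0, 100.0]   # the jump to 100 happens "tomorrow"
correct = moving_average(prices, 3, 3)           # uses 11, 12, 13
leaky = moving_average_off_by_one(prices, 3, 3)  # uses 12, 13, 100
```

As of index 3 the correct average knows nothing of the jump, while the buggy one has already absorbed it. A signal built on the buggy version would look prescient in a backtest.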

One good diagnostic is to be suspicious whenever the results of a study are too good 
to be true. It is natural to be happy when a study seems to work out well. Rather than 
becoming elated that a study has seemingly worked out very well, we should immediately
assume that knowledge of the future has somehow crept in and figure out how and why. 

Another insidious form of knowledge of the future occurs when using retroactively
adjusted or edited data. If we were doing a study based on fundamental data and arbitrarily
used a historical database provided by some supposedly reliable vendor, we may be
subject to significant knowledge-of-the-future biases. Many vendors retroactively adjust
their data to reflect the latest reported changes. Sometimes these are accounting changes
that are only announced months, or even years, later.

The classic example is the Enron fraudulent accounting scandal. Just prior to the 
revelations about Enron, the fundamental data seemed to scream that the stock was 
incredibly cheap by any reasonable accounting valuation metric. Yet, when viewed in the 
light of later released and retrospectively corrected accounting statements that reflected 
the true situation, it was clear that the stock was a scam. The problem was, the corrected 
accounting data were only made available years later. At the time of Enron’s demise, one 
could not have known simply by looking at the current published information. 

Enron is an extreme case, but many more subtle cases arise all the time. The key 
issue is knowledge of the future. Any study that includes such knowledge is suspect, and 
such trading strategies are to be avoided. 

Historical price data can include such pitfalls as well. For example, it is fairly common
for price reports and quotes to be erroneous. In particular, a single quote can be
typed incorrectly, or data transmission problems can result in a single bit being dropped
and a digit will be off by 1. Most such errors will show up as the reported high or low price
for the day. The reason for this is simple. Errors tend to be large. A 5 that is changed to a
6 can result in a 20 percent error in the reported high for the day if the price is changed
from 51.14 to 61.14. Alternatively, a 51.14 price can be morphed into 52.14. It all depends on the
decimal place in which the error occurs.

A good way to avoid such errors is to use a price continuity filter that screens out 
discontinuous jumps and large changes. At first blush, it may seem that the way to deal 
with these is to retroactively adjust the prices when the corrected data is known. The 
point of this section of the book should be clear by now. You cannot use knowledge of 
the future. If the corrected prices are not known at the time, the information should
only be used when it is known. Obviously, this requires some additional bookkeeping or 
complexity in the way data is handled. At a minimum, new information should be time 
stamped and never altered once it is captured. Even if corrected information comes in, 
that, too, should be time stamped and only made available to a retroactive study after it
is known. Databases that are permanently burned onto CDs or DVDs are excellent ways 
to ensure against retroactive data corruption. The data on a write-once medium cannot 
be altered. Even so, time-stamping individual transactions is still essential. 
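
One way to picture the bookkeeping is an append-only store in which every value carries the date on which it became known. The class below is a hypothetical sketch, not a description of any vendor's database; the names and the sample quotes are assumptions made for illustration.

```python
from datetime import date

class PointInTimeStore:
    """Append-only record of values, each stamped with the date it became known."""

    def __init__(self):
        # (known_on, effective_date, value); entries are added, never edited
        self._records = []

    def record(self, known_on, effective_date, value):
        self._records.append((known_on, effective_date, value))

    def as_of(self, effective_date, study_date):
        """Latest value for effective_date that was already known on study_date."""
        known = [
            (k, v)
            for k, e, v in self._records
            if e == effective_date and k <= study_date
        ]
        if not known:
            return None
        return max(known, key=lambda kv: kv[0])[1]

store = PointInTimeStore()
trade_day = date(2001, 1, 2)
store.record(date(2001, 1, 2), trade_day, 61.14)   # erroneous high as first reported
store.record(date(2001, 1, 5), trade_day, 51.14)   # correction arrives three days later
```

A study simulating January 3 sees the uncorrected 61.14, exactly as a trader would have at the time. Only from January 5 onward does the corrected figure become visible.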

Sample selection can also introduce knowledge of the future biases into a study. For 
example, it is well known that mid-cap stocks that have grown large enough to be in- 
cluded in the S&P 500 index provide a profitable universe of stocks to buy. Several hedge 
funds and astute traders have developed strategies that seek to identify which stocks will 
be dropped and which added to the index. The strategy usually involves buying those that 
are to be added and short selling those that are to be dropped. Implicit in this theory is 
the idea that fixed-strategy index funds will be required to buy the newly added stocks 
and sell the dropped issues. The hedge fund hopes to profit on the differential regardless 
of market direction. 

Suppose we start with a list of the current members of the S&P 500 index. We then 
propose to perform a historical study of these issues. Where is the harm in that? 

There is a subtle bias contained in such a study. By selecting current members of the
index, we are incorporating knowledge of the future. We could not know with certainty 
which stocks would be in the index five years later. Nor could we know which would 
be dropped. By avoiding all issues that were subsequently dropped, we may well have a 
sample that avoids all negative growth and companies that fell by the wayside. Similarly, 
our sample includes all those companies that were not members of the list years ago but 
joined the select group presumably through superior growth. This illustrates that knowledge
of the future need not be quantitative, but can be transmitted merely by qualitative
membership in a class or group.


PREDICTIVE STUDIES VERSUS NONPREDICTIVE
STUDIES


Recently, it has become a fad on Wall Street to look at coincident correlations of price 
levels as though they are somehow predictive. An analyst will present a chart of two
price levels that are both rising and nod sagely that there is a correlation between the 
two. The entire analysis is predicated upon the fact that both are rising at the same time. 
Alternatively, the chart might show that both are falling at the same time. 

In his book Statistics on the Table (Harvard University Press, 1999), Professor 
Stephen M. Stigler uses the example of a tree and a boy. Both are growing taller every 
year. But the two are not correlated. The growth spurts of a tree will have much to do
with its own life cycle, sunlight, and annual rainfall. The boy’s growth rate will corre- 
spond to his feeding, family history, and life cycle—notably, puberty. The annual differ- 
ences will show little correlation, even though both are growing. A correlation analysis 
of two time series that are both growing will often show spurious correlations simply 
because the analysis is based on levels of a price series. 

Knowing how much the tree grew last year will not allow any reasonable prediction 
of what the boy’s growth will be in the next year. To be predictive, we must have two 
properties. First, there must be a correlation. Second, the correlation must extend from 
the past values of the predictor variable to the future values of the variable that we wish 
to predict. In other words, the predictor must lead the variable to be predicted. A
relationship that is only correlated in the present is at best a descriptive relationship. It 
shows only comovement, but is not predictive. 
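
The distinction between a coincident and a predictive correlation can be made concrete. In the sketch below, the function names are illustrative and the data are constructed by hand so that the target simply repeats the predictor one period later; real data will be far noisier.

```python
import math

def correlation(x, y):
    """Pearson correlation of two equal-length series of net changes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def lagged_correlation(predictor, target, lag=1):
    """Correlate predictor[t] with target[t + lag]; the predictor must lead."""
    if lag < 1:
        raise ValueError("lag must be at least 1 for a predictive test")
    return correlation(predictor[:-lag], target[lag:])

# Constructed example: the target repeats the predictor one period later.
predictor = [1.0, -2.0, 3.0, -1.0, 2.0, -3.0, 1.0, 2.0]
target = [0.0] + predictor[:-1]

coincident = correlation(predictor, target)            # same-period comovement
predictive = lagged_correlation(predictor, target, 1)  # past predictor, future target
```

Only the lagged figure speaks to prediction. The coincident figure describes comovement and nothing more.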

It should be emphasized that this discussion is directed to the question of what is a 
predictive relationship. This is to be distinguished from the coincident correlation that 
most stocks have with each other and the market overall. As we saw earlier, the coin- 
cident correlation relationship can be very important in analyzing a portfolio model and 
how best to optimize it. Two stocks that move together do not serve to reduce risk as 
well as two stocks with little correlation. The best of all is two stocks that are negatively 
correlated. However, knowing this does not allow us to predict the direction in which 
the two stocks will move. We only know they will move together. 


Often traders will seek to use limit orders or stop-loss orders for entry and exit of market 
positions. Modeling these can be tricky. For example, a limit order at a price of 50 may 
appear to be a simple thing to model. But if one only uses closing data, and yesterday 
closed at 50.20 and today at 50.30, the order may or may not have been executed. Based 
solely on closing data, we cannot tell what the true story is. 

Obviously, what is needed is the ability to include intraday highs and lows in our 
analysis. But even that remedy is not without pitfalls. Suppose the low for the day was 
exactly 50 and our order was at 50. Again, there is no way to tell if the order would have 
been executed or not executed with a report of “Stock ahead” in line. Although this is 
a relatively rare case, it still must be addressed by the serious researcher. One way to 
handle this is to assume that the limit order is only executed a percentage of the time if 
it coincides with the high or low. This percentage can vary from zero to 100 percent. 

Another way to handle it is to analyze it both ways: once with the assumption that
the order is always executed and once assuming it is never filled. If the two are significantly
different, then the matter should be investigated further. If not, then it may be a
nonissue. In any event, it is likely that limit orders at the extrema for the time period will 
be adversely selected. Simply put, you will get orders filled when you do not want them
filled and never get them when you do want them filled. 
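
A sketch of the limit-order treatment for a buy order follows. The function name and the sample lows are hypothetical, and the fill percentage at the extreme is an input the researcher must choose, not something the data supply.

```python
def buy_limit_fill_probability(limit, low, touch_fill_prob):
    """Chance that a buy limit order was executed during one bar.

    A low strictly below the limit is treated as a certain fill. A low exactly
    at the limit is filled only `touch_fill_prob` of the time (0.0 to 1.0).
    """
    if low < limit:
        return 1.0
    if low == limit:
        return touch_fill_prob
    return 0.0

def expected_fills(limit, lows, touch_fill_prob):
    """Expected number of fills over a series of daily lows."""
    return sum(buy_limit_fill_probability(limit, low, touch_fill_prob) for low in lows)

lows = [49.90, 50.00, 50.20]        # hypothetical daily lows
never_at_touch = expected_fills(50.00, lows, 0.0)
always_at_touch = expected_fills(50.00, lows, 1.0)
```

Running the study at both extremes brackets the result. A wide gap between the two is the signal to investigate further.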

A similar situation arises with stop orders. One would think that when a stop price 
is hit the trade will occur at that price. In fact, this rarely happens. Once the stop price 
is hit or exceeded, the order then becomes actionable—usually as a market order unless 
a limit was specified. In that eventuality, the trader must expect to pay the then-current 
bid-ask spread. So too, for any market order that is given. One must expect to pay the 
spread. This is a phenomenon often called slippage in the vernacular of Wall Street. Any 
historical study must explicitly take into account slippage by some means. Naturally, we 
must add commissions to the slippage. 
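
A rough way to charge slippage and commissions to a stop order is sketched below. Everything here is an assumption for illustration: the function name, the choice to charge the full spread against the trigger price, and the treatment of a gap open below the stop as a fill from the open.

```python
def sell_stop_proceeds(stop, open_price, low, spread, commission, shares):
    """Net proceeds of a sell stop that becomes a market order once triggered.

    Returns None when the day's low never reached the stop price.
    """
    if low > stop:
        return None                      # stop never touched
    trigger = min(stop, open_price)      # a gap open below the stop fills from the open
    fill = trigger - spread              # pay the spread as slippage
    return shares * fill - commission

normal = sell_stop_proceeds(50.0, 51.0, 49.5, 0.05, 10.0, 100)
gapped = sell_stop_proceeds(50.0, 48.0, 47.0, 0.05, 10.0, 100)
untouched = sell_stop_proceeds(50.0, 51.0, 50.5, 0.05, 10.0, 100)
```

The spread and commission figures should come from the researcher's own trading records wherever possible.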

One final caveat is in order. The highs and lows for the time period are notoriously 
susceptible to error. Although the use of highs and lows is highly recommended for trad- 
ing studies, it must be accompanied by thorough scrutiny of the quality of such data. 


ADJUSTED DATA MAY BE ERRONEOUS 


Yahoo! Finance is an excellent source of historical price data for stocks. In many cases,
the daily price histories go back 40 or 50 years. The short Internet URL for Yahoo! is:


finance.yahoo.com


Google Finance has recently entered the arena with its own price history data as well.
Undoubtedly, the competition between the two companies will stimulate even better 
data sources and improved quality in the future. The short form of the Google address is: 


www.google.com/finance 


Yahoo has a data column that displays the adjusted closing prices. The open, high, 
low, and close data are unadjusted. This means that the user must adjust the other fields 
individually. This is easily accomplished by either taking the difference between the ad- 
justed close and the original close, or by taking the ratio of the two prices as an adjust- 
ment factor. 

However, there is a hidden catch with this procedure that can lead to serious prob- 
lems in the data. This problem is round-off error. Yahoo! and most other data vendors 
round their data to two decimal places. A classic example of this would be Microsoft 
stock. From the 1980s to the present, MSFT has enjoyed dramatic compounded growth. 
The stock followed the good fortunes of the company and maintained an ever-increasing 
upward trajectory, at least until the year 2000. 

The upward trajectory of prices is one piece of the problem. The other is the fact 
that prices are reported and stored in fixed decimal format. Typical is a two-decimal-place
format, but a few data sources preserve four-decimal-place accuracy. In the case
of Microsoft, the Yahoo! finance site shows the adjusted prices for the stock from April 
18, 1986, through April 11, 1986, to be $.09—an adjusted price of 9 cents per share. For 
seven straight days, the price was supposedly unchanged. 

Obviously, something is wrong with this picture. At the time, Microsoft was a darling 
of Wall Street, one of the high-flying glamour favorites. In fact, the problem is the double 
whammy of the extreme growth in Microsoft to date and the use of the fixed decimal 
format by the data vendors. Thus, the real adjusted price for MSFT may have been any- 
where from 8.5 cents to 9.5 cents. We have no way to know. But when the price crossed 
the round-off threshold from above 8.5 cents to below 8.5 cents, the price appeared to 
drop precipitously from 9 to 8 cents per share. Thus, when we look at the net change on 
a percentage basis, it appears to be a rather largish 11 percent decline. In fact, the drop 
at the time may have been minimal, but just enough to push the stock below the 8.5-cent
threshold on an adjusted basis. We cannot tell from the adjusted data as it is.
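
The arithmetic can be checked directly. The two "true" adjusted prices below are invented for illustration, since the real values are exactly what the rounded data no longer tell us.

```python
def pct_change(p0, p1):
    """Fractional change from p0 to p1."""
    return (p1 - p0) / p0

# Hypothetical true adjusted prices on either side of the 8.5-cent threshold.
true_prices = [0.0851, 0.0849]
stored_prices = [round(p, 2) for p in true_prices]    # what a two-decimal vendor keeps

true_move = pct_change(true_prices[0], true_prices[1])          # a fraction of 1 percent
apparent_move = pct_change(stored_prices[0], stored_prices[1])  # about -11 percent
```

A move of roughly a quarter of one percent is recorded as a drop from 9 cents to 8 cents.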


Clearly, the foregoing discussion indicates a serious problem with the adjusted historical 
price data provided by many, and perhaps most, vendors. Effectively, the original data for 
MSFT prices have been reduced to only a single digit. All numerical significance greater 
than a single significant digit has been lost. Fortunately, there is a remedy. The solution 
is for investors to do their own data adjustment. 

Essentially, the key is to find a vendor who offers the necessary data to perform 
your own adjustments. Yahoo! is one such vendor. It offers stock-split data as well as 
distribution data such as dividends. Then you can perform the necessary adjustments 
to as many decimal places as your software can handle. Under current double-precision
technology, that is usually about 16 significant digits. Using that level of precision, the
calculations are more than adequate to fix the round-off error problem.
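
One common way to do the adjustment is to work backward from the most recent price, carrying a cumulative factor in full floating-point precision. The sketch below is only one such convention and is an assumption, not the method of any particular vendor: a split divides all earlier prices by the split ratio, and a dividend scales all earlier prices by one minus the dividend over the prior close. It ignores the case of a split and a dividend on the same ex-date.

```python
def back_adjust(closes, splits=None, dividends=None):
    """Back-adjust raw closes without rounding.

    closes:    list of (date, raw_close), oldest first
    splits:    {ex_date: ratio}, e.g. 2.0 for a 2-for-1 split
    dividends: {ex_date: cash amount per share}
    Returns a list of (date, adjusted_close), oldest first.
    """
    splits = splits or {}
    dividends = dividends or {}
    factor = 1.0
    adjusted = [None] * len(closes)
    for i in range(len(closes) - 1, -1, -1):     # newest to oldest
        day, close = closes[i]
        adjusted[i] = (day, close * factor)
        # An event on this ex-date rescales every earlier price.
        if day in splits:
            factor /= splits[day]
        if day in dividends and i > 0:
            prior_close = closes[i - 1][1]
            factor *= 1.0 - dividends[day] / prior_close
    return adjusted

# Hypothetical 2-for-1 split with ex-date d2.
split_case = back_adjust(
    [("d1", 100.0), ("d2", 50.0), ("d3", 51.0)], splits={"d2": 2.0}
)
# Hypothetical 1.00 dividend with ex-date d2.
dividend_case = back_adjust([("d1", 100.0), ("d2", 99.0)], dividends={"d2": 1.0})
```

Rounding, if it is wanted at all, should be applied only when results are displayed, never to the stored adjusted series.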


There are several other data issues that can arise. One is the issue with fundamental data
whereby the data is retroactively adjusted by the data vendor. Literally, the vendor will
go back and edit his historical database as revised information is reported. The trouble
with this is that it renders any sort of historical study unreliable, if not impossible. The only
way around this is either to obtain dual data sets from the vendor or to build them yourself.

A dual data set would include a complete time-stamped database of information that 
was available at the time. It must have a guarantee that there were and will be no retrospective
alterations. The other part of the dual data set would be the retrospectively
adjusted data. For the purposes of the investment manager, the latter data is nearly 
useless. 

There are also numerous errors that are reported in simple pricing data. Most often, 
these occur in the highs and lows for the day. Simple filters to assure continuity of price 
histories can go a long way toward identifying and eliminating such errors. However, it 
is important to mention that if these cannot be identified and eliminated by simple filters, 
they should be captured, time stamped, and recorded as they were known at the time, 
just as was done for the fundamental data. 
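
A simple continuity filter might look like the sketch below. The function name, the dictionary layout of each bar, and the 15 percent threshold are all assumptions chosen for illustration. The filter only flags suspect values so that they can be reviewed and recorded; it does not alter the data.

```python
def flag_discontinuities(bars, max_move=0.15):
    """Flag highs or lows that sit too far from the prior day's close.

    bars: list of dicts with 'date', 'high', 'low', 'close', oldest first.
    Returns a list of (date, field) pairs for review.
    """
    flags = []
    for prev, bar in zip(bars, bars[1:]):
        for field in ("high", "low"):
            if abs(bar[field] / prev["close"] - 1.0) > max_move:
                flags.append((bar["date"], field))
    return flags

# Hypothetical bars: the second day's high of 61.14 should have been 51.14.
bars = [
    {"date": "2024-01-02", "high": 51.20, "low": 50.70, "close": 51.00},
    {"date": "2024-01-03", "high": 61.14, "low": 50.80, "close": 51.10},
    {"date": "2024-01-04", "high": 51.60, "low": 50.90, "close": 51.30},
]
suspect = flag_discontinuities(bars)
```

The threshold needs tuning to the volatility of the instrument, since a filter that is too tight will flag genuine moves.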


TABULATE AND SAVE THE DETAILED RESULTS
WITH DATES 


Any good study should, as a minimum, include a detailed tabulation of which days were 
included if a selection of some sort was made. This detailed tabulation can be used as the 
basis for portfolio modeling when the results of the study are subsequently utilized for 
trading. The saved results become the inputs for the robust analysis of optimal portfolio 
position sizes and correlation studies. 


OVERLAPPING DATES ARE IMPORTANT FOR 
CORRELATIONS 


It is important to synchronize and reconcile dates in one’s data. Some people have always
thought of the bond market as a gentleman’s club. Although ladies are now permitted to 
enter, there is still some truth to this perception. One example of this is the fact that the 
bond market, in its wisdom, seems to have more holidays than the equity markets. 

Suppose we were doing a study and using yesterday's change in bond prices to help 
predict the stock market for tomorrow. However, the bond market was closed for a holi- 
day yesterday. Does that mean that the price change from two days ago is now predictive 
of today and tomorrow in the stock market? Alternatively, should we exclude this signal 
from one day or the other? These questions can be tested, although the samples will be 
small. Nevertheless, the issues are real and must be handled by data synchronization. 

A similar issue arises when dealing with international markets. Different countries 
have completely different holiday schedules. Thus, data synchronization is important in 
these realms as well. 

Another critical consideration in the international arena is to get the time zones right. 
For example, the FTSE is traded in London and opens before the New York markets start 
trading. However, that does not mean that it is safe to use the FTSE price change for 
today as a predictor of the New York market move for today. In fact, both markets share
some trading hours in common. Currently, it is the first hour of New York trading that 
coincides with the last hour of London trading. 

Thus, there is a built-in correlation, but it is a simultaneous relationship that cannot 
be used for prediction without considerable care. Generally, the use of conditional pre- 
dictors based on intermarket relationships in which the markets overlap trading hours 
should be avoided. 


CALCULATE MEAN, STANDARD DEVIATION, VARIANCE, AND PROBABILITY OF WIN


As a minimum, any good study should calculate at least a few key statistics. Certainly it 
is important to find the mean return. Then either the standard deviation or the variance 
should be computed. Either one of the latter is usually sufficient. Summary statistics 
should include a recitation of n, the number of observations included. Often overlooked, 
but always quite useful, is the probability of a win or winning percentage of trades. 

Other helpful statistics would include a summary of the quantiles of the return dis- 
tribution, the skew, and kurtosis. These latter figures can be readily found in R. Many 
traders prefer to use the drawdown statistic, which is a measure of the decline from the 
highest level of the equity curve to the nadir of its greatest drop. Although the statistic is 
somewhat flawed, nevertheless, it is in common use and generally well understood. The 
flaws are that it is a highly variable statistic and subject to runs of good and bad luck, 
more so than measures of risk such as standard deviation. The drawdown metric is not a 
relative measure. It is also subject to manipulation. For example, if the largest drawdown 
was seven years ago, then the unscrupulous operator need only present the last six years
of history to exclude the offending item. 
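
Although the text points to R for these figures, the basic ones are easy to compute in any language. The Python sketch below uses only the standard library; the function names and sample data are illustrative. The drawdown is measured in the units of the equity curve, not as a percentage, in keeping with the remark that it is not a relative measure.

```python
import statistics

def summarize(returns):
    """Key statistics for a list of per-trade or per-period returns."""
    n = len(returns)
    return {
        "n": n,
        "mean": statistics.mean(returns),
        "stdev": statistics.stdev(returns),          # sample standard deviation
        "win_pct": sum(r > 0 for r in returns) / n,  # share of strictly positive returns
    }

def max_drawdown(equity):
    """Largest decline from a running peak of the equity curve to a later low."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, peak - value)
    return worst

stats = summarize([0.01, -0.02, 0.03, 0.0])              # hypothetical returns
drawdown = max_drawdown([100.0, 120.0, 90.0, 130.0, 110.0])  # hypothetical equity curve
```

Reporting n alongside the other figures lets the reader judge how much weight the rest of the summary can bear.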

Another important measure to calculate is the Sharpe ratio and perhaps the Sortino
ratio as well. These are generally accepted measures of risk-adjusted return and can be quite helpful in
evaluating the results of a study. It is also useful to perform the beta regression analysis.
From this we get the alpha as a measure of relative market performance. We also calculate
the beta to look at the relative market multiplier and as a measure of correlated risk.
The R² is helpful as a measure of market correlated risk.
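
These measures can be sketched with the standard library alone. The definitions below are common textbook forms and are assumptions in the details: the ratios are per period and not annualized, the Sortino ratio uses the root mean square of shortfalls below a target of zero, and the beta regression is ordinary least squares of the asset's returns on the market's. The sample returns are invented.

```python
import math
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by its sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

def sortino_ratio(returns, target=0.0):
    """Mean return above target divided by downside deviation.

    Undefined (division by zero) when no return falls below the target.
    """
    shortfalls = [min(0.0, r - target) ** 2 for r in returns]
    downside = math.sqrt(sum(shortfalls) / len(returns))
    return (statistics.mean(returns) - target) / downside

def beta_regression(asset, market):
    """Least-squares fit asset = alpha + beta * market; returns (alpha, beta, r_squared)."""
    ma, mm = statistics.mean(asset), statistics.mean(market)
    sxy = sum((m - mm) * (a - ma) for a, m in zip(asset, market))
    sxx = sum((m - mm) ** 2 for m in market)
    syy = sum((a - ma) ** 2 for a in asset)
    beta = sxy / sxx
    alpha = ma - beta * mm
    return alpha, beta, sxy * sxy / (sxx * syy)

market = [0.01, -0.02, 0.03, 0.00, 0.015]        # hypothetical market returns
asset = [0.001 + 1.5 * m for m in market]        # asset built to have beta 1.5
alpha, beta, r_squared = beta_regression(asset, market)
```

Because the sample asset is an exact linear function of the market, the fit recovers the alpha and beta used to build it with an R-squared of essentially one. Real returns will show much more scatter.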


ROBUST METHODS TO FIND STATISTICS 


In earlier chapters of this book, we discussed the fact that the normal and lognormal 
are inadequate distributions to perfectly describe the distribution of market returns. In 
order to analyze returns without assumptions about the underlying distribution, it is best 
to use robust resampling methods. Essentially, these randomize the data and allow us to
repeatedly resample from the distribution. This facilitates building confidence intervals 
and quantiles for the actual empirical distribution, whatever it may be. 

Robust methods can also be used with multivariate distributions. However, in this 
case the usual best practice is to combine together observations for each variable at the 
same time. The import of this can be seen if one looks at the wild fluctuations around 
the time of the 1987 stock market crash. Not only did the stock market exhibit extraor- 
dinary volatility, but most other markets did as well. Markets such as bonds, currencies, 
gold, and oil all exhibited extraordinary volatility. Therefore, keeping those observations 
together serves to preserve the correlation structure of the intermarket relationships. 

An even better practice is to resample the data at random times, such as random 
weeks or months, but to keep the sequence of data at the random time intact. This technique
can help to preserve any suspected or unknown sequential relationship in the data
set. If we approach the random resampling in this manner, then we can be less concerned 
about issues such as the fact that the GARCH models have shown us that the volatility 
as measured by variance is not stationary but seems to change regimes over time. 


CONFIDENCE LIMITS FOR ROBUST STATISTICS 


The importance of confidence limits cannot be overemphasized. For example, if we wish 
to know if a given mean is really greater than zero, we can simply look at the confidence 
limits at the 95 percent level. If the range bounded by the limits does not include zero, 
then we can be confident that the mean is truly above zero and not simply due to chance 
this time around.

To compute any robust statistic, we resample from the original data a subset of, say,
80 percent of the total observations in the data set. For each resampled subset, we
calculate the desired statistics: mean, standard deviation, variance, correlation, and
anything else. Each such statistic is saved in an array of, say, 1,000 items. To tabulate the
frequency distribution for that particular statistic, we sort the array from smallest to
largest. Assuming we chose 1,000 elements in our array, the 5 percent level is given by
the element in the sorted array at position 50. The element at position 950 represents the
95th percentile of the cumulative probability distribution for that statistic. Thus, the
position can be read directly as a probability percentile that is usually abbreviated as p.
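The procedure just described can be sketched in a few lines. This is a minimal illustration, not the program supplied with the book. It assumes that each 80 percent subset is drawn without replacement (a classic bootstrap would instead draw with replacement), and the data are simulated rather than real market returns. The function names resampled_statistics and at_percentile are hypothetical.

```python
import random
import statistics

def resampled_statistics(data, statistic, n_resamples=1000, fraction=0.8, seed=0):
    """Return the sorted values of `statistic` over random subsets of `data`."""
    rng = random.Random(seed)
    k = int(fraction * len(data))
    # Assumption: each subset is drawn without replacement.
    values = [statistic(rng.sample(data, k)) for _ in range(n_resamples)]
    values.sort()
    return values

def at_percentile(sorted_values, p):
    """Read the value at cumulative probability p using 1-based positions,
    so that position 50 of 1,000 is the 5 percent level."""
    position = round(p * len(sorted_values))
    position = min(max(position, 1), len(sorted_values))
    return sorted_values[position - 1]

# Hypothetical data: 250 simulated daily returns.
rng = random.Random(7)
returns = [rng.gauss(0.001, 0.01) for _ in range(250)]

means = resampled_statistics(returns, statistics.mean)
lower = at_percentile(means, 0.05)
upper = at_percentile(means, 0.95)

print(f"5 percent level of the mean:  {lower:.5f}")
print(f"95 percent level of the mean: {upper:.5f}")
print("band lies entirely above zero:", lower > 0)
```

Any other statistic can be substituted for statistics.mean, for example statistics.stdev, without changing the rest of the sketch.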


In many ways, the goal of this chapter is to put it all together. To a large extent, the
earlier chapters have laid a good foundation. Thus, to some extent the work of finalizing
is made easier. For the most part, the reader should have now acquired the necessary
skills to understand how to accomplish the goals of portfolio modeling.

This chapter discusses the broad subject of when to choose the theoretical and when
to choose the empirical distribution for one's model. In general, the empirical is to be
preferred, but there are important exceptions discussed in the text.

From there, the concepts naturally flow to the choice of the objective function
or utility model. A thorough discourse follows on whether you should choose the
ubiquitous Sharpe Ratio or the innovative Log Log utility model proposed in this
book.

Beyond that, the chapter addresses the ideas involved in model simulation. Of
special import are the concepts relating to the joint multivariate distribution, with the
possibility of correlative relationships. The reader is shown how to deal with this
particular circumstance.

Finally, the chapter and the book close with a discussion of the differences between
professional money managers and simple private investors. The reader is shown
how to adapt the ideas and objective functions of this book to his or her particular
circumstance.




CHOOSING THE THEORETICAL DISTRIBUTION 


The theoretical distributions offer several advantages over using the empirical distribu- 
tions. Following are four of these advantages: 


1. The theoretical distributions offer closed form solutions for many formulas and
statistics. This can be a mathematical convenience.

2. The properties are well known. Distributions such as the normal and log normal can
be characterized by their respective mean and variance.

3. Tables and analytical tools are widely available. They are usually included in
software packages such as Excel and R.

4. Working with the theoretical distributions is much less computer intensive than
working with the empirical distribution. Savings in computer time can be on the
order of a thousandfold when compared to robust resampling methods.


Offsetting these advantages are four disadvantages to using the theoretical
distributions:


1. The tails of relative return distributions tend to be too fat. This implies the ex- 
istence of outliers in the data, or at least a nonstationary variance per the GARCH 
models. 

2. The usual statistical tests will tend to be biased when estimating significance. In
particular, tests such as the t test will tend to find significance somewhat more than
5 percent of the time at the 5 percent level. Thus, all results are at least slightly
suspect.

3. The existence of the fat tails in stock returns and other speculative markets is 
widely known. Anyone who seeks to sell his or her results or services to others will 
undoubtedly encounter at least mild resistance to the use of the standard theoretical 
distributions. 

4. The normal family of theoretical distributions runs into difficulty when we attempt
to apply it to skewed or extremely exotic distributions. Good examples are options
and spreads involving either options or futures. Other examples of highly skewed
distributions include complex strategies involving stop losses and profit targets.
Any model that includes options or stop-loss and profit-target strategies is almost
certainly nonnormal and should not use the theoretical distributions.


The Combined Optimal Portfolio Model


The empirical distribution is the actual distribution that we observe in the data. The goal 
is to make as few assumptions about the data as possible with respect to distribution. 
We also seek to avoid any assumption about coterminal or lagged correlations in the 
data. Ideally, we also wish to avoid assumptions with respect to coterminal or lagged 
intermarket relationships. 

Using resampling methods, we are essentially asking the question, “If all we know
about the data is contained in the data, then what are the chances that this sample could
have arisen by chance?” Ideally, we make no assumptions about the distribution, other
than that this is the data. This question is then answered by resampling the data perhaps
100 to 1,000 times. Each time the data is resampled, we use a slightly smaller sample than
the entire data set. Thus, to the extent that the actual distribution of the data is skewed or
has fat tails, the resampled data will exhibit similar properties as well.

We can summarize the advantages of using the empirical distributions as follows: 


• The portfolio modeler makes minimal assumptions about the nature of the
underlying probability distribution.

• The use of robust methods is nonparametric. It does not rely on a proper estimation
of the usual parameters of the normal distribution.

• Robust methods directly provide the necessary quantiles of the distribution of the
sample means, standard deviations, and other desired statistics. They provide a
straightforward way to test for significance and derive confidence intervals and p
values. In order to interpret the data, one does not even need a statistics book. The
percentiles of the data itself will give the required p values.

• Robust methods require no knowledge of the mathematical properties of the normal
or log normal distributions, because they are not used. Rather, the skill set to perform
robust resampling requires only knowledge of how to calculate the desired statistic
and how to sort the multiple resampled statistics.

• Robust methods can be applied to extremely skewed and exotic distributions, such
as the distribution of option and spread returns. In contrast, the theoretical
distributions do not apply in any straightforward fashion to these kinds of trading
instruments. The only way one could model options with a theoretical distribution
would be to treat the underlying asset as lognormal and then work through
the complicated mathematics of the distribution implied by the Black-Scholes
model.

• The Black-Scholes model is predicated on the concept of a log-normal distribution of
price changes in the underlying stock. The result would be a new exotic theoretical
distribution. However, it would be neither normal nor lognormal, but rather, highly
skewed. Any analysis of the distribution would have to be derived by the user using
higher math. Clearly, the empirical distribution is a far superior choice.
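To make the point about skewed instruments concrete, the sketch below resamples the returns of a single call option held to expiry. Every number in it is a hypothetical assumption chosen for illustration (spot price, strike, premium, volatility, and a lognormal underlying with no drift), and none is taken from the book's programs. The positions 50 and 950 of the sorted resampled means are read directly as the 5 percent and 95 percent levels.

```python
import math
import random
import statistics

rng = random.Random(42)  # arbitrary seed, for repeatability only

# Hypothetical one-period call option; all four numbers are illustrative.
spot, strike, premium, vol = 100.0, 105.0, 2.0, 0.10

returns = []
for _ in range(500):
    terminal = spot * math.exp(rng.gauss(0.0, vol))   # lognormal underlying
    payoff = max(terminal - strike, 0.0)
    returns.append(payoff / premium - 1.0)            # -1.0 means premium lost

def skewness(xs):
    """Third standardized moment of the sample."""
    m = statistics.mean(xs)
    s = statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

# Resample 80 percent subsets (without replacement) and sort the means.
k = int(0.8 * len(returns))
means = sorted(statistics.mean(rng.sample(returns, k)) for _ in range(1000))
lower, upper = means[49], means[949]   # positions 50 and 950

print("median return:", statistics.median(returns))
print("mean return:  ", round(statistics.mean(returns), 4))
print("skewness:     ", round(skewness(returns), 4))
print("5 and 95 percent levels of the mean:", round(lower, 4), round(upper, 4))
```

The median return is a total loss of the premium while the mean is pulled upward by a few large payoffs, which is exactly the kind of distribution that a mean and variance alone describe poorly. The resampled percentiles require no such description.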


There are three major drawbacks of using the empirical distribution: 


1. The entire distribution must be saved. It is no longer possible to simply summarize 
the data by the mean and standard deviation, knowing that these will adequately 
define the underlying normal or log normal distribution. 

2. Considerably more computer time will be spent performing the resampling
procedures hundreds to thousands of times, as compared to the closed form statistical
calculations of the standard distributions. Fortunately, computing power is becoming
still cheaper, so this disadvantage diminishes with time.

3. Using robust methods can sometimes confuse the underlying intuition that one can 
obtain from a thoughtful understanding of the normal and log normal distribution. 


The last of these disadvantages is one of the reasons that this book has discussed both
the theoretical and the empirical distributions. The theoretical has offered us intuition and
insights into how markets work that may be simpler to understand than a robust
number-crunching exercise. Yet, both techniques will yield the right answer.

You should now have the tools and knowledge to make an intelligent choice between 
the theoretical and empirical. In general, the empirical methods will give better answers. 
Occasionally, the empirical distribution together with robust methods may be the only 
model that will give a reasonably accurate answer. 


SELECTING SHARPE VERSUS A LOG LOG OBJECTIVE 
FUNCTION 


Sometimes the organization in which one works will dictate an objective function. Per- 
haps more to the point, the quarterly or annual bonus may be tied to a certain perfor- 
mance metric. In such a circumstance, the choice of an investment objective function 
becomes abundantly clear to the portfolio manager. 

However, for those who have more freedom to choose their performance metric, we
will discuss the merits of the two most common. First, the Sharpe Ratio is ubiquitous
in portfolio performance evaluation. It serves as a de facto standard in the capital
management industry. It is an accepted metric and is generally welcome and understood in
almost any context in the investment world. The ratio measures the excess return
accruing to a portfolio per unit of risk.
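As a brief illustration of "excess return per unit of risk," the following sketch computes a per-period Sharpe Ratio. It assumes that risk is measured by the sample standard deviation of the excess returns and that no annualization is applied. The function name, the sample returns, and the risk-free rate are all hypothetical.

```python
import statistics

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation of the
    excess returns, both measured per period (no annualization)."""
    excess = [r - risk_free for r in returns]
    return statistics.mean(excess) / statistics.stdev(excess)

# Hypothetical periodic returns, for illustration only.
sample_returns = [0.02, 0.01, -0.01, 0.03, 0.00]

ratio_no_rf = sharpe_ratio(sample_returns)
ratio_with_rf = sharpe_ratio(sample_returns, risk_free=0.002)

print("Sharpe ratio, risk-free rate of zero:", round(ratio_no_rf, 4))
print("Sharpe ratio, risk-free rate 0.002:  ", round(ratio_with_rf, 4))
```

Subtracting a constant risk-free rate lowers the numerator but leaves the standard deviation unchanged, so the ratio falls as the risk-free rate rises.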

By contrast, the log log objective function helps to maximize the long-term
compounded utility of wealth to the manager and potentially to the client. The inherent
advantages of this should not be ignored. They are significant. For the manager or individual
who is interested in achieving a very high compounded return while still maintaining an
acceptable risk profile, this measure is to be preferred.

Figure 9.1 Iterated Log Function

It should be emphasized that the iterated log function is a conservative function (see 
Figure 9.1). The two log functions offer double concavity that is a more conservative risk 
measurement than that offered by a single log function as in the maximal compounded 
return formula. In one sense, the iterated log function represents using the log function 
itself as a measure of risk. 
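The double concavity can be seen numerically. The sketch below assumes that the iterated log takes the form ln(1 + ln(1 + r)) for a net return r. This form is an assumption made here for illustration, since this section does not restate the formula from the earlier chapters. The function names are likewise hypothetical.

```python
import math

def log_utility(r):
    """Single log of the gross return, as in the compounded return model."""
    return math.log(1.0 + r)

def iterated_log_utility(r):
    """Assumed form of the iterated log: ln(1 + ln(1 + r)).
    Defined only while ln(1 + r) > -1, that is, r > 1/e - 1 (about -0.632)."""
    return math.log(1.0 + math.log(1.0 + r))

sample_returns = (-0.50, -0.10, 0.00, 0.10, 0.50, 1.00)

for r in sample_returns:
    single = log_utility(r)
    double = iterated_log_utility(r)
    print(f"return {r:+.2f}   single log {single:+.4f}   iterated log {double:+.4f}")
```

Under this assumed form the iterated log never exceeds the single log, and it penalizes a 50 percent loss far more heavily than it rewards a 50 percent gain, which is the conservative behavior described above.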

Interested readers are invited to compare the form of the various log functions used 
here to similar formulas given by Shannon in his famous work on information theory. 
In a like fashion, you should note that the formulas are quite similar to those given by 
Hamilton with respect to the concept of entropy in thermodynamics. 


Generally, the best practices for performing model simulation involve trying to
replicate the underlying process of the market. When we model multiple securities or
investments, we can model on the actual data itself in its original order. However, we can also
randomly resample from our data set in order to establish confidence bands for the
statistics of interest. Naturally, random resampling offers the best method to model our
portfolio and trading techniques in an assumption-free and distribution-free manner.

There are three primary ways to perform robust resampling of multivariate data: 


1. The simplest form of random resampling is to select a sub-sample of our data
without any constraints. Under this regime, each stock i from each period t has
an equal probability of being selected. Implicit in this schema is the assumption that
stock returns are independent of one another within the period and unrelated from
period to period. Of course, we know that stock returns are highly correlated with
one another within the same period. This simple form of sampling ignores that fact,
and this is likely to lead to spurious results that do not reflect the realities of the
markets. This method is not recommended except perhaps as a cross check on the
assumptions.

2. To avoid the problems induced by contemporaneous correlations between assets,
we can and should restrict our resampling to observations from the same period.
Under this form of resampling, we would select all stocks within period t. However,
the period t itself would be selected randomly. This type of resampling can generally
be recommended, and it is often quite good. It does effectively capture the existing
correlations and coincident co-movements between stocks.

The major drawback to this technique is that it ignores any sequential correlations 
or cross market correlations that may exist from one period to the subsequent pe- 
riod. However, generally these correlations are quite small and can often be ignored. 

Another area of concern is the situation in which there exist nonlinear serial re- 
lationships. It is possible for no significant simple linear correlations to exist, but 
nonlinear relationships still remain from period to period. A good example of this is 
the GARCH class of models. From these we know that there is often a serial correla- 
tion between periods of high volatility and the next period. It is the volatility that is 
positively correlated without respect to direction of returns. 

This can be the case even though there is no simple linear correlation that would 
allow one to make money. Evidence in support of the GARCH models can be seen 
simply by looking at the correlation between the net return squared for this period 
and the net return squared for the next period. The squared terms are large for both 
large positive and large negative numbers. The original direction of the change is lost 
when the return data are squared. 


3. The third technique for resampling is to randomize the samples by start date, but
continue the analysis sequentially from there for a predetermined number of time
periods. As an example, we could randomly select period t and all stocks from that
period and continue our analysis for 20 days from there. Alternatively, we could
perform our analysis from time t backward for a similar window of, say, 20 days.
This technique will preserve both the same period correlation structure of the data
as well as any linear or nonlinear relationships hidden in the data. Generally
speaking, this procedure is the best choice of the three.

However, remember that using overlapping periods in the analysis can cause
problems when not using robust statistics. Thus, the best practice is to perform the
robust resampling with more resamples than usual. If the usual number of resamples
is 1,000, then perhaps it should be 10,000. Obviously, the issue involves the availability
of computing resources, as well as other practical issues. If performed correctly,
this form of resampling is the best and requires the fewest assumptions in order to
implement.
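The three resampling schemes described above can be sketched as follows. This is an illustration under stated assumptions rather than the book's own code. The returns are held as a list of periods, each period being a list of stock returns. The data are hypothetical, with two stocks that move together exactly. The first scheme is interpreted here as drawing each stock's return from its own independently chosen period. The function names are illustrative only.

```python
import random

def resample_cells(returns, n_rows, rng):
    """Scheme 1: each stock's return comes from its own random period,
    so the relationship between stocks within a period is destroyed."""
    periods, stocks = len(returns), len(returns[0])
    return [[returns[rng.randrange(periods)][j] for j in range(stocks)]
            for _ in range(n_rows)]

def resample_periods(returns, n_rows, rng):
    """Scheme 2: draw whole periods at random; all stocks in a period
    stay together, but the order of periods is lost."""
    return [returns[rng.randrange(len(returns))] for _ in range(n_rows)]

def resample_window(returns, window, rng):
    """Scheme 3: random start date, then `window` consecutive periods."""
    start = rng.randrange(len(returns) - window + 1)
    return returns[start:start + window]

# Hypothetical data: 100 periods, two stocks; stock B always returns
# twice what stock A returns in the same period.
data = [[0.001 * t, 0.002 * t] for t in range(-50, 50)]
rng = random.Random(0)  # arbitrary seed, for repeatability only

cells = resample_cells(data, 80, rng)
rows = resample_periods(data, 80, rng)
window = resample_window(data, 20, rng)

print("scheme 1 rows found in the original data:", sum(r in data for r in cells))
print("scheme 2 rows found in the original data:", sum(r in data for r in rows))
print("scheme 3 window starts at period index:  ", data.index(window[0]))
```

In practice the statistic of interest would be computed on each resample and the results sorted to obtain percentiles, repeating the third scheme for many random start dates.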


Unfortunately, there is a built-in conflict of interest between the money manager and the 
individual or ultimate investor. Examples of this abound. A case in point would be the 
manager whose incentive compensation is based on hitting a certain threshold perfor- 
mance level. Suppose the level is 10 percent. Once that level is reached, the manager has 
every incentive to cut risk and attempt to coast to the finish line for that period with the 
goal safely in hand. It can be demonstrated that when the market is up after three quar- 
ters, many fund managers actually reduce risk. From their standpoint, they may figure, 
“Why risk a good performance year right at the end?” So they reduce their risk in order 
to preserve gains. 

In another scenario, the manager who has locked in his bonus threshold for the year 
may decide to play it safe. He no longer has any incentive to take risks. In fact, added 
risk can only be punished. 

Suppose that the situation is slightly different. The manager has achieved a 9 percent 
gain for the period as the end approaches. His bonus increases if the return can be im- 
proved to 10 percent. Perhaps a compensation structure such as this might induce the 
manager to take larger risks as the end of the period nears. He is hoping that the wilder 
swings of the increased risk will carry him over the top to his desired goal. 

It is important to define what the appropriate goal for the portfolio is. It is the opinion
of the author that any professional manager's compensation should be based on and
compatible with the putative goals of the investors. To the extent that this does not
occur, there arises the possibility of a conflict of interest. However, one must recognize
that the realities of marketing are ever present and inescapable. Attracting and retaining
clients is essential to every capital management business. Thus, it is fair to say that a
little of each must be part of the mix.




We also observe that the ideal professional manager will be compensated by a for- 
mula that reflects the underlying goals of his investors. In this light, the use of the log log 
utility function described in earlier chapters has much to recommend it. 

In particular, the iterated log formula is likely to be more in tune with the long-term
goals of the investors. It properly weights the long-term compounded return and offers
the added value of maximizing the utility of wealth in a fashion that most investors would
choose for their own portfolios. The iterated log function also offers a conservative,
concave, risk-averse function to properly reduce risk.


About the CD-ROM


This appendix provides you with information on the contents of the CD that accompanies 
this book. For the latest and greatest information, please refer to the ReadMe file located 
at the root of the CD. 


• A computer with a processor running at 120 MHz or faster

• At least 32 MB of total RAM installed on your computer; for best performance, we
recommend at least 64 MB

• A CD-ROM drive


NOTE: Many popular spreadsheet programs are capable of reading Excel files. How- 
ever, users should be aware that a slight amount of formatting might be lost when using 
a program other than Microsoft Excel. 


Using the CD with Windows 


To install the items from the CD to your hard drive, follow these steps: 


1. Insert the CD into your computer’s CD-ROM drive. 


2. The CD-ROM interface will appear. The interface provides a simple point-and-click 
way to explore the contents of the CD. 


If the opening screen of the CD-ROM does not appear automatically, follow these 
steps to access the CD: 


1. Click the Start button on the left end of the taskbar and then choose Run from the 
menu that pops up. 




2. In the dialog box that appears, type d:\start.exe. (If your CD-ROM drive is not drive
d, fill in the appropriate letter in place of d.) This brings up the CD interface described
in the preceding set of steps.


WHAT'S ON THE CD 


The following sections provide a summary of the software and other materials you'll 
find on the CD. The CD contains program examples which are intended to illustrate the 
subjects presented in the book. The examples are written in either Excel or the statistical 
programming language R. 


Content 


The CD is organized into two major categories: 


1. Programs. This includes the Excel and R language programs that relate
to the subjects covered in the text. There are numerous program and application
examples that illustrate the points discussed. In addition, the code for many of
the graphics and figures used in the text is included to provide additional
concrete examples.


2. Data. This includes both sample empirical data and simulated stock prices for
randomized models.


The programs may be run from either the CD-ROM or the hard drive once the user
has copied the files over to the hard drive. Generally it will prove more convenient to run
them from the hard drive.


Using the Programs 


Excel. For the purposes of this section it is assumed that the user has already installed
Excel. For help on this the reader is referred to the documentation that came with the
software package.

Running the Excel spreadsheets from within Excel is performed by loading the
appropriate file from the folder. It is assumed that the user has basic knowledge of the
operation of Excel, so this will not be discussed further here.

Understanding the spreadsheets does not require knowledge of Visual Basic for
Applications. It is not used. Only fundamental spreadsheet knowledge is assumed. The
purpose of the examples is to introduce the user to some of the more advanced features
of Excel and to illustrate how they might be used for modeling applications. Special
emphasis is on the optimization and statistical techniques discussed in the book.


R Language. R is an open source statistical language that offers powerful functions
with only a few user-written commands. R offers math and statistical functions as well
as powerful data manipulation and graphical plotting routines.

Readers who have not installed R may wish to refer to the accompanying section
entitled Installing R. It describes how to download R and install it.

There are also several resources and references listed in the Installing R section. So
even if the user has already installed the R package, it may prove helpful to review this
section as well.


To Run the Programs on this CD-ROM 


In order to run the programs on this CD-ROM, the user should first copy the entire
contents of this CD onto the C drive in the folder called C:\Optimal, which is shorthand for
Optimal Portfolio Modeling, the name of the book.

This procedure may be carried out by entering the following command in the
command line on your Windows machine.


xcopy D:\*.* C:\Optimal\*.* /S


The above assumes that your D: drive is your CD-ROM drive. 

There is a shortcut to a batch file on this CD-ROM that will perform the same
function. To execute it, just double-click on the file named 'Install' in the main directory
of the CD-ROM.


About Chapter 1: Loading Data 


The files on this CD-ROM pertaining to Chapter 1 are really more tutorials on how to load
data into Excel and R. The instructions contained in the text files show how to download
free stock price history data from Yahoo.

The Excel spreadsheet in Chapter 1 is an example of what the data should look like
as of the date the spreadsheet was written.

Your spreadsheet should look similar but will be updated to the current date on which
you download your data.

The R program will download the data for a sample stock symbol. In order to
download a different symbol, just edit the symbol.

In order to run the R code contained in the text file with extension .txt, you simply
copy and paste the code from the text file into the R GUI environment. Some programs
may require loading of various library packages. If for some reason your environment
does not have the required package loaded, simply click Packages then Load Package
from the R menu. From there you will see a list of available packages. Simply select the
one that is required and the download will proceed automatically.


About the Programs 


The programs are intended to show particular examples of very simple concepts which 
often arise in portfolio modeling. Each is meant to be short and very understandable if 
one is willing to read the code. 


About the Figure Programs 


The programs in this folder are examples of how the figures in the book were created. As 
such they are a very useful tutorial on how to handle certain types of data and functions 
discussed in the text. They also serve as illustrations of how to produce graphics output 
with a minimum of effort in both Excel and R. As before the emphasis is on brevity and 
ease of understanding. 


UPDATES TO THE CD-ROM


When and if there are updates to the CD-ROM and the programs associated with this text 
the author intends to publish them at his web site. The URL for the update page is: 


http://www.pmecdonnell.com/OptimalPortfolioModeling


Readers are encouraged to check there periodically for software updates and bug 
fixes. 


CUSTOMER CARE 


If you have trouble with the CD-ROM, please call the Wiley Product Technical Support 
phone number at (800) 762-2974. Outside the United States, call 1(317) 572-3994. You can 
also contact Wiley Product Technical Support at http://support.wiley.com. John Wiley 
& Sons will provide technical support only for installation and other general quality con- 
trol items. For technical support on the applications themselves, consult the program's 
vendor or author. 

To place additional orders or to request information about other Wiley products, 
please call (877) 762-2974. 


APPENDIX 


Probability    Normal      Probability    Normal
0.005         -2.57583     0.120         -1.17499
0.010         -2.32634     0.125         -1.15035
0.015         -2.17009     0.130         -1.12639
0.020         -2.05375     0.135         -1.10306
0.025         -1.95996     0.140         -1.08032
0.030         -1.88079     0.145         -1.05812
0.035         -1.81191     0.150         -1.03643
0.040         -1.75069     0.155         -1.01522
0.045         -1.69540     0.160         -0.99446
0.050         -1.64485     0.165         -0.97411
0.055         -1.59819     0.170         -0.95416
0.060         -1.55477     0.175         -0.93459
0.065         -1.51410     0.180         -0.91537
0.070         -1.47579     0.185         -0.89647
0.075         -1.43953     0.190         -0.87790
0.080         -1.40507     0.195         -0.85962
0.085         -1.37220     0.200         -0.84162
0.090         -1.34075     0.205         -0.82389
0.095         -1.31058     0.210         -0.80642
0.100         -1.28155     0.215         -0.78919
0.105         -1.25357     0.220         -0.77219
0.110         -1.22653     0.225         -0.75541
0.115         -1.20036     0.230         -0.73885


Probability    Normal      Probability    Normal
0.235         -0.72248     0.455         -0.11304
0.240         -0.70630     0.460         -0.10043
0.245         -0.69031     0.465         -0.08784
0.250         -0.67449     0.470         -0.07527
0.255         -0.65884     0.475         -0.06271
0.260         -0.64334     0.480         -0.05015
0.265         -0.62801     0.485         -0.03761
0.270         -0.61281     0.490         -0.02507
0.275         -0.59776     0.495         -0.01253
0.280         -0.58284     0.500          0.00000
0.285         -0.56805     0.505          0.01253
0.290         -0.55338     0.510          0.02507
0.295         -0.53884     0.515          0.03761
0.300         -0.52440     0.520          0.05015
0.305         -0.51007     0.525          0.06271
0.310         -0.49585     0.530          0.07527
0.315         -0.48173     0.535          0.08784
0.320         -0.46770     0.540          0.10043
0.325         -0.45376     0.545          0.11304
0.330         -0.43991     0.550          0.12566
0.335         -0.42615     0.555          0.13830
0.340         -0.41246     0.560          0.15097
0.345         -0.39886     0.565          0.16366
0.350         -0.38532     0.570          0.17637
0.355         -0.37186     0.575          0.18912
0.360         -0.35846     0.580          0.20189
0.365         -0.34513     0.585          0.21470
0.370         -0.33185     0.590          0.22755
0.375         -0.31864     0.595          0.24043
0.380         -0.30548     0.600          0.25335
0.385         -0.29238     0.605          0.26631
0.390         -0.27932     0.610          0.27932
0.395         -0.26631     0.615          0.29238
0.400         -0.25335     0.620          0.30548
0.405         -0.24043     0.625          0.31864
0.410         -0.22755     0.630          0.33185
0.415         -0.21470     0.635          0.34513
0.420         -0.20189     0.640          0.35846
0.425         -0.18912     0.645          0.37186
0.430         -0.17637     0.650          0.38532
0.435         -0.16366     0.655          0.39886
0.440         -0.15097     0.660          0.41246
0.445         -0.13830     0.665          0.42615
0.450         -0.12566     0.670          0.43991




Probability    Normal      Probability    Normal
0.675          0.45376     0.840          0.99446
0.680          0.46770     0.845          1.01522
0.685          0.48173     0.850          1.03643
0.690          0.49585     0.855          1.05812
0.695          0.51007     0.860          1.08032
0.700          0.52440     0.865          1.10306
0.705          0.53884     0.870          1.12639
0.710          0.55338     0.875          1.15035
0.715          0.56805     0.880          1.17499
0.720          0.58284     0.885          1.20036
0.725          0.59776     0.890          1.22653
0.730          0.61281     0.895          1.25357
0.735          0.62801     0.900          1.28155
0.740          0.64334     0.905          1.31058
0.745          0.65884     0.910          1.34075
0.750          0.67449     0.915          1.37220
0.755          0.69031     0.920          1.40507
0.760          0.70630     0.925          1.43953
0.765          0.72248     0.930          1.47579
0.770          0.73885     0.935          1.51410
0.775          0.75541     0.940          1.55477
0.780          0.77219     0.945          1.59819
0.785          0.78919     0.950          1.64485
0.790          0.80642     0.955          1.69540
0.795          0.82389     0.960          1.75069
0.800          0.84162     0.965          1.81191
0.805          0.85962     0.970          1.88079
0.810          0.87790     0.975          1.95996
0.815          0.89647     0.980          2.05375
0.820          0.91537     0.985          2.17009
0.825          0.93459     0.990          2.32634
0.830          0.95416     0.995          2.57583
0.835          0.97411


The table was produced on a spreadsheet that is available on the CD-ROM
accompanying this volume. It illustrates the use of the normsinv() function, which returns
the value of the standard normal variable for a given cumulative probability, that is, the
inverse of the cumulative normal distribution function.

Readers will note that the distribution from .005 to .500 is the mirror image of the 
distribution from .500 to .995. This property illustrates the reflection principle in a very 
intuitive way. 
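For readers who prefer to regenerate the table outside a spreadsheet, the sketch below does so with the NormalDist class from the Python standard library, whose inv_cdf method plays the same role as normsinv(). Python is used here only for illustration and is not the language of the programs on the CD-ROM. The variable names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probabilities 0.005, 0.010, ..., 0.995, built from integers so that
# floating-point error does not accumulate.
probabilities = [i / 1000 for i in range(5, 1000, 5)]
table = [(p, standard_normal.inv_cdf(p)) for p in probabilities]

for p, z in table:
    print(f"{p:.3f}  {z:9.5f}")
```

Each value printed for a probability p is the negative of the value printed for 1 - p, which is the mirror-image property noted above.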


APPENDIX 2


Those readers who have not installed the statistical language R can do so with the
information provided in this section.

The R Project home page offers many helpful links to R resources and is generally
considered the reference source of choice for most information on R. The Web site may
be referenced from the following URL link:

http://www.r-project.org/

The latest copy of the R language can be downloaded at no charge from the following
page:

http://www.r-project.org/

Simply select a download mirror site near you for the fastest possible download.
Readers located in the United States will need to scroll way down the page to locate the
download sites based in the United States.

Installation and administration of your R installation is discussed in a downloadable
document available from the Web site just given.

The latest copy of the R language manual is available at the following Web page:

http://cran.r-project.org/doc/manuals/R-lang.pdf




This appendix provides an introduction to the statistical and graphics language R. R
is a freely available language that is based on the commercial language S. Programs
written in either language are virtually identical to programs written in the other.
Therefore, users can usually run S programs unchanged in the free R environment.
In the same fashion, R programs will usually run unchanged in an S environment.

The developers of R have created the language as a labor of love. The software is
freely available, and the developers have worked to provide the benefits of R to the world
without asking any compensation. It is a quintessential example of freely available
software at its best.

The following document provides an introduction to programming in the R language.
It is provided as a way to contribute to the R community and as a way to spread the
word about R.

The document itself is presented as a verbatim copy of the original. Full credit is
given to the authors. In doing so, we have relied on and reproduced the permissions
given in the first few sentences.


adverse selection, 108 

alpha, 44, 111 
persistence of, 46 

arbitrage pricing theory, 2, 10

Arrow, Kenneth, 83 

Ask, 7 

author’s motto, 20 

auto regression, 22 

autocorrelation, 99 

average true range, 63 


Bachelier, Louis, 3-4 
Beat the Dealer, 78 
bell shaped curve, 15 
Bernoulli, Daniel, 37, 82-83 
proposed log utility, 37 
Bernoulli, Nicolas, 37 
beta, 44, 111
persistence of, 46 
bid, 7 
bid-ask bounce, 24 
Black Scholes model, 115 
Black, Fisher, 10 
Brownian Motion, 4 
Burgi, Joost, 82 


calls, similar to stop loss, 65 

Capital Asset Pricing Model, 
45, 46, 95-96 

CAPM, see Capital Asset 
Pricing Model 

cdf normal, 18 

CD-ROM, I1 

Churchill, Winston, 51-52 


Index 


compensation, manager 
behavior at thresholds, 
120 
correlation, 76, 96, 99 
coefficient, definition, 42 
lagged, 99 
non-linear, 99 
of squared returns, 118 
Covariance, defined, 42 
effect on variance, 95 
Cramer, Gabriel, 82 
cumulative density function, 
16 
normal, 17 


data adjustments, 109 
adjusting for holidays, 111 
mining, 64 
net changes, 103 
price levels, 103 
quality, 102 
round off errors in, 109 
sample bias, 106 
synchronization, 103 
time stamping, 106 

Debreu, G., 83 

distribution 
advantages of empirical, 

69, 70 
binomial, 14 
choice of, 114 
continuous theoretical, 94 
definition, 14 
empirical, 23-24, 75, 96, 
114-115 


empirical advantages of, 
114-115 
empirical disadvantages 
of, 114 
empirical disadvantages 
of, 116 
empirical probability 
density function 
(pdf), 24 
log normal, 19-20 
lognormal as multiplicative 
model, 19, 61 
non-parametric, 24 
normal, 15-16 
normal, as additive model, 
61 
parametric, 24 
probability, 14 
Student, 40 
tails, fat tails, 22, 114 
theoretical advantages of, 
114 
theoretical disadvantages, 
114 
theoretical, 114 
diversification, effect of, 81 
drawdown, 78 
drift, 6 


e, 33, 37 

Efficient frontier, 46-47 

efficient market hypothesis, 
2-5, 9 

EGARCH, 23 

Einstein, Albert, 3-4 




EMH, father of, 4 
errors tend to be large, 106 
ETF, exchange traded fund, 
24 
Euler, e, 21 
Euler, Leonhard, 37 
Excel, 1, 105 
Excel, Solver, 86, 90 
installing, 91 
operation, 90 
expectation, 29 


fair game, 30 

Fisher, R.A., 40 

fundamental data, retroactive 
adjustment of, 110 


Gambler's Ruin, 31 

gap opening, 63 

GARCH, 22-28, 48, 112 
sequential correlation, 118 

Google, 108 

Gosset, William Sealy, 40 


Hamilton, 117 

heteroskedasticity, 22, 104 

highs, susceptible to errors, 
108 


independent identically 
distributed, 20, 41 

infinite variance, 23 

information from the past, 
105 

innovation, 16 

iterated log function, 117 

iterated log model, concavity 
of, 98, 99 


Kahneman, Daniel, 34, 37 
Kepler, Johannes, 82 
knowledge of the future, 105 
kurtosis, 48 


law of eponymy, 4 
log log model, 95, 97 
log log utility, 120 


log normal cumulative 
density function, 21 

logarithm, 21 

logarithms, 32-33 

lognormal, approximation to 
the empirical, 26-27 

lows, susceptible to errors, 
108 


Mandelbrot, Benoit, 22 
market makers, 9 
Markowitz, Harry M., 42, 46 
maximum adverse excursion, 
62 
implications of, 84 
superiority of, 84 
wait time, 84 
maximum investment 
formula, 77 
maximum return model, 73 
limitations, 74 
mean, 16, 19, 111 
median, 16 
model, accuracy with highs 
and lows, 108 
simulation, 118 
backtesting, 102 
Models, non-linear, 100 
with autocorrelation, 100 
Modern portfolio theory, 42, 
48 
Morgenstern, Oscar, 83 
MPT, see Modern 
Portfolio Theory 
multiplicative model, see 
distribution, log normal 


Napier, John, 81-82 
Napier, Napier’s Bones, 82 
New York Stock Exchange, 13 
normal, approximation to the 
empirical distribution, 
26-27 
characterized by mean, 
variance, 17 
rational approximation, 
18-19 




objective function, 68 
choice of, 116 
log log, 116 
Odean, Terrance, 2 
Optimization, in Excel, 87-88 
in R, 86 
order, limit, 7 
market, 7 
orders, impact on models, 
107 
stop, 108 
stop-loss definition, 52 
Osborne, M.F.M., 4, 5, 13 
overfitting, 64 


Paradox, St. Petersburg, 
36-37 

portfolio, market portfolio as 
optimal, 46-47 

portfolio, optimal, 42 

portfolio, optimal weightings, 
90 

position sizing as risk 
control, 71 

Pratt, G., 84 

private investor, 119 

probability function, normal, 
18 

Probability of Rise, 2-3 

probability of win, 111 

professional money manager, 
19 

profit target, summary, 64-65 

profit targets, 51, 64 

Prospect Theory, 34, 36-37 

puts, analog to profit targets, 
65-66 


R language, 1, 105 

R, optim routine, 88 

R, optimizer routines, 88 

random sampling, 
with correlation, 100 
with look-back windows, 101 
random walk, 2-3 
random, definition, 7 




random, innovation, 17 
Reflection Principle, 13, 16, 
51, 53 
regression, log S&P, 5-6 
relative return, 68-69 
resample, multivariate 
distribution, 77 
resample, windows, 77 
resampling 
by random dates, 118 
by random windows, 119 
by windows, 118 
return, average compound, 33 
return, average daily, 2-3 
return, excess, 48 
returns, average, 70 
compound portfolio 
returns, 70 
risk, 51 
defining, 38 
estimation, 39 
market, 44 
systematic, 44 
minimum risk models, 41 
robust methods, applicable 
to, 115 
robust resampling, 112, 115 
robust statistics, confidence 
limits, 114 
Ross, Stephen, 10 
Rubinstein, Mark, 84 


Samuelson, Paul, 3 
scientific method, 23 
self-attribution fallacy, 6 
self-similarity, 13, 15 
semi-standard deviation, 40 
semi-variance, 40 
serial correlation, 22, 99 
Shannon, Claude, 117 
Sharpe ratio, 47, 81, 85, 
95-96, 111, 116 
interpretation, 48 
optimal model, 86 
optimization, 89, 94 


Sharpe, William F., 44, 95 
skew, 48 
slide rule, 82 
slippage, 63, 108 
Sortino Ratio, 111 
St. Petersburg Paradox, 
81-82 
stale pricing, 24 
standard deviation, 2, 3, 
i 
Statistics on the Table, 107 
Stigler, Stephen M., 4, 40, 
106-107 
law of eponymy, 40 
stock selection, 1 
stock splits, adjusting for, 
109 
stop, 9 
stop profit, 51, 64 
stop-loss, 51 
stops, combined with profit 
targets, 65 
double probability, 58 
effect on kurtosis, 60 
effect on mean, 53 
effect on skew, 59-60 
effect on variance and 
standard deviation, 
58-59 
generate two 
commissions, 55 
modeling empirically, 63 
normal distribution, 54 
optimizing, 63 
path dependence, 61 
probability of being at stop 
doubles, 57 
reduce probability of gain, 
56 
similar to puts and calls, 
52, 65 
slippage, 62 
summary of effects, 61 
time out of market as a 
cost, 56 


story, finance prof, 
economist, trader, 9 
studies, non-predictive, 106 
studies, predictive, 106 
study statistics, 111 
sum of random variables, 41 
symmetric distribution, 16 
symmetry, 15 


t test, 40 

Theory of Games, 83 
Thorp, Edward O., 78 
Tobin, James, 44 
Tversky, Amos, 34, 37 


utility, 80 
utility of wealth, 117 
based on mean variance, 
48 
concave, 83 
function, 80 
iterated logarithms, 84 
log, 35-37 
log log, 84 
logarithmic, 48, 83-84 
maximum compounded 
utility, 84 
model, 81 
monotonically increasing, 
83 
square root function, 82 
utils, units of utility, 84 


variance, 22 
Variance of correlated 
variables, 42 
of random variables, 41 
formula for with 
covariance, 94 
infinite, 22 
Von Neumann, John, 83 


Wall Street, 1 


Yahoo, 108 


